In-memory analytics is an emerging business intelligence approach that is likely to reshape analytics in the near future. "In-memory" refers to data that resides in RAM (Random Access Memory), as opposed to data stored on disk. When the CPU processes data, that data must be in memory; if it is not, it first has to be fetched from disk, which slows the process down. In-memory computing is therefore much faster, especially when dealing with large volumes of data.
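To make that gap concrete, here is a minimal Python sketch (the record count, file format, and lookup keys are arbitrary assumptions for the demo) that times the same lookups against data re-read from disk versus data already held in a dictionary in RAM:

```python
# Illustrative sketch: repeated lookups against data re-read from disk
# versus data already resident in memory.
import json
import os
import tempfile
import time

records = {str(i): {"id": i, "value": i * 2} for i in range(100_000)}

# Write the same records to a temporary file on disk.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(records, f)
    path = f.name

# Disk path: every access re-reads and re-parses the file.
start = time.perf_counter()
for key in ("42", "999", "73512"):
    with open(path) as f:
        value = json.load(f)[key]["value"]
disk_time = time.perf_counter() - start

# In-memory path: the same lookups against the dict already in RAM.
start = time.perf_counter()
for key in ("42", "999", "73512"):
    value = records[key]["value"]
mem_time = time.perf_counter() - start

print(f"disk: {disk_time:.4f}s  memory: {mem_time:.6f}s")
os.remove(path)
```

On any machine the in-memory lookups finish orders of magnitude faster, since the disk path pays for file I/O and parsing on every access.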
One way to understand in-memory analytics is to think about how our brain stores millions of pieces of information every single day. This information is stored in an organized way: as nodes that represent concepts, linked to one another based on their relatedness. It is as if the information sat in folders containing files tagged with different captions. Retrieving a piece of information takes time, because we have to go to the right folder, find the required file, and read it.
In-memory analytics does something similar. It enlarges the fast tier of storage so that everything is instantly available in real time, reducing delay and removing latency. In a traditional database, roughly 90% of the data lived on disk and only 10% in memory; an in-memory database keeps roughly 90% of its data in memory, cutting access times dramatically, especially when applications need to reach the data to support features such as real-time information.
With the recent fall in RAM prices and the arrival of 64-bit operating systems, manufacturers can now build machines with several terabytes of main memory, enough to hold an entire data warehouse. This means that the traditional methods of storing data in tables on disk and indexing or optimizing queries in advance are no longer required. Even sophisticated models can be executed in near real time, and business intelligence applications become dramatically more flexible, because pre-aggregated "cubes" are no longer necessary. In-memory computing also outperforms caching, a widely used technique for speeding up query performance, because it is not limited to a small set of previously fetched results.
Business intelligence has two main parts: reporting and analytics. Analytics is the harder part: it involves analysing huge datasets, performing computations, and comparing data in order to understand shopping patterns, usage patterns, and so on. How quickly this analysis runs depends on how quickly the datasets can be accessed and how quickly they can be computed over. In the traditional data-warehouse approach, data is stored in the warehouse and portions of it are extracted into data marts, where it is pre-analysed and pre-aggregated; business intelligence applications then report against the results stored in the data mart. With in-memory computing, by contrast, business intelligence applications do not work against partial data: results are computed in real time by querying a built-in calculation engine. In the in-memory computing (IMC) approach, the data is frequently updated and the queried results can be visualized directly.
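As a rough illustration of this difference, the following Python sketch uses the standard library's sqlite3 module in its in-memory mode; the `sales` table and its rows are invented for the example. Instead of reading a pre-aggregated summary from a data mart, the aggregate is computed on demand over the full dataset held in RAM:

```python
# Sketch: on-demand aggregation over an in-memory database, in place of
# a pre-computed data-mart summary. Table and data are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")  # the entire database lives in RAM
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("north", "widget", 120.0),
        ("north", "gadget", 80.0),
        ("south", "widget", 200.0),
        ("south", "gadget", 50.0),
    ],
)

# The aggregate is computed at query time against the full dataset,
# not read back from a stored partial result.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('north', 200.0), ('south', 250.0)]
```

Because the underlying rows stay available, any new question (a different grouping, a different filter) is just another query, with no pre-aggregation step to rebuild.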
This is where in-memory computing revolutionizes analytics: the entire dataset is available in memory, instead of requiring a time-consuming disk fetch.
Consider a scenario where an entire report is presented to the user, and all of the underlying data needs to be available so that the user can drill through it using filters. For a conventional business intelligence tool, querying the entire dataset from the database takes so long that tools normally avoid it: they pull data at a high level and fetch the detailed data only when the user drills down. Every drill-down then triggers another query, consuming time. In-memory business intelligence, by contrast, makes the entire dataset available from the start. Both the initial load and every subsequent drill-down happen at the speed of thought, because no disk queries are required.
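A minimal sketch of this pattern in Python, assuming a small invented `orders` dataset: the data is loaded once, and each drill-down is just another in-memory filter rather than a fresh round trip to the database:

```python
# Sketch: drill-down over data already loaded in memory. The records
# and fields are invented sample data.
orders = [
    {"year": 2023, "quarter": "Q1", "region": "east", "total": 100},
    {"year": 2023, "quarter": "Q1", "region": "west", "total": 150},
    {"year": 2023, "quarter": "Q2", "region": "east", "total": 90},
    {"year": 2023, "quarter": "Q2", "region": "west", "total": 210},
]

def drill_down(rows, **filters):
    """Each drill-down level is a pure in-memory filter --
    no query is issued per interaction."""
    return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]

q2 = drill_down(orders, quarter="Q2")        # first drill-down: one quarter
q2_west = drill_down(q2, region="west")      # deeper drill-down: one region
print(sum(r["total"] for r in q2_west))      # 210
```

Each successive filter narrows the working set without touching storage, which is why interactive response times stay flat however deep the user drills.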
This kind of quick, interactive analysis is essential when the user is trying to uncover unknown patterns or discover new opportunities. With conventional reporting tools, users must stick to specific questions, such as "What are the sales for this quarter?" With in-memory visualization tools, the question can be more open-ended and analytical: "Show the data and the patterns involved in this quarter's sales."
The ability to perform 'what-if' analysis on the fly is another feature of in-memory tools. For example, users may want to know how this quarter's profits would change if the prices of several items were increased or transit times were reduced. A conventional OLAP (online analytical processing) tool would require the entire database to be recalculated overnight, whereas with in-memory technology the data is held in memory and the results are available immediately.
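With the line items already in memory, a what-if scenario amounts to recomputing over the same rows with adjusted parameters. The sketch below assumes an invented two-item dataset and a hypothetical uniform 5% price increase:

```python
# Sketch: what-if analysis as an instant recomputation over in-memory
# rows. Items, costs, and the price adjustment are invented.
items = [
    {"name": "widget", "price": 10.0, "cost": 6.0, "units": 100},
    {"name": "gadget", "price": 25.0, "cost": 15.0, "units": 40},
]

def profit(rows, price_adjust=1.0):
    """Total profit under a hypothetical uniform price multiplier."""
    return sum((r["price"] * price_adjust - r["cost"]) * r["units"] for r in rows)

baseline = profit(items)        # current prices: 800.0
scenario = profit(items, 1.05)  # what if all prices rose 5%?
print(baseline, scenario)
```

No overnight recalculation is involved: trying a different adjustment is a single function call over data that never left RAM.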
Many popular analytics tools, such as QlikView, Tableau, TIBCO Spotfire, SAP HANA, and Oracle Exalytics, make use of in-memory processing to varying degrees. Although the hardware required for in-memory technology is expensive and the business intelligence applications that can exploit this speed are still limited in number, this style of computing should definitely be watched as an industry trend. It will not only help companies gain speed but also surface the deeper, more granular insights needed to increase revenue. Data scientists will no longer be restricted to a sample; they will be free to apply as many analytical techniques and iterations as needed to find the best model, exploiting the vast memory available in present-day computers.