Wednesday 11 December 2013
Rick van der Lans is an independent consultant, speaker and author who specialises in Business Intelligence and data warehousing. He came across CXAIR through our Dutch partners Systemation and wrote a great blog post reflecting his views on the product. So naturally we wanted to share this with you…
Exploring and Investigating Data with CXAIR
“Where a data scientist or analyst will find the answer to his quest is not always obvious beforehand. For example, when he is looking for the dominant factor influencing sales of particular products, when he tries to find a way to increase the customer care level, or when he tries to establish the risk level of selling car insurance to young people, he may not have any idea what the answer may be. He may not know which data sets are needed to come up with an answer, or which data items he has to study upfront.
Therefore, he needs tools that allow him to freely explore and investigate data. Incorporating more data sets in the analysis should be very easy, there should be no need to specify a goal beforehand, and it must be possible to analyze data in an unguided way.
Besides the more standard features, such as displaying data as bar charts, in dashboards, and on geographical maps, the perfect tool for this type of work should have at least the following characteristics:
- No advance preparations: There should be no need to predefine data structures before the analysis work begins. If data is available, it should be possible to load it without any preparations, even if it concerns a new type of data.
- Unguided analysis: Analysts should be able to invoke the analysis technology without having to specify a goal in advance. The exploration technology should allow for analyzing data in an unguided style.
- Self-service: Analysts should be able to use the analysis techniques without help from IT experts.
Connexica’s analysis tool called CXAIR is such a tool. Its natural language interface, Venn-diagramming techniques, and visualization features allow users to freely query and analyze data. No cubes or star schemas have to be defined beforehand (which would limit the analysis capabilities).
CXAIR internally organizes all the data using an intelligent index. In fact, internally it’s based on text-search technology. This makes it possible to combine and relate data without any form of restriction, which is what analysts need.
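To illustrate the general idea of index-based analysis, here is a minimal sketch of an inverted index in which Venn-style combinations become plain set operations. This is not CXAIR's actual implementation, which is not described in detail here; the function `build_index` and the sample records are hypothetical and invented purely for illustration.

```python
from collections import defaultdict


def build_index(records):
    """Map each (field, value) term to the set of record ids containing it."""
    index = defaultdict(set)
    for record_id, record in enumerate(records):
        for field, value in record.items():
            index[(field, str(value).lower())].add(record_id)
    return index


# Hypothetical sample data, invented for illustration only.
records = [
    {"product": "bike", "region": "north", "age_group": "young"},
    {"product": "bike", "region": "south", "age_group": "senior"},
    {"product": "car", "region": "north", "age_group": "young"},
]

index = build_index(records)

# Each lookup returns the set of matching record ids.
bikes = index[("product", "bike")]
north = index[("region", "north")]

# Venn-style combinations are ordinary set operations on those id sets.
bikes_in_north = bikes & north   # records in both circles
bikes_or_north = bikes | north   # records in either circle
bikes_not_north = bikes - north  # records only in the first circle
```

Because every field is indexed in the same way, any two terms can be combined without a schema defined in advance, which is the property the paragraph above describes.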
Unlike most analysis tools, CXAIR uses search technology that speeds up data analysis. For calculations, a mixture of in-memory and on-disk caching is used to analyze massive amounts of data at search engine speeds. Because CXAIR provides a thin-client web interface, all the loaded data resides on the server. In other words, no data is loaded on the client machine. Numbers are cached on the server, but text is not. The fact that CXAIR doesn’t cache all the data means that available memory is not a restriction. Caching improves performance, so a large internal memory is not a necessity, but it will help, particularly for ad-hoc calculations.
CXAIR is clearly a representative of a new generation of reporting/analysis tools that users can deploy to freely analyze data. It’s a tool for self-service discovery and investigation of data. It’s the tool that many data scientists have been waiting for and is worth checking out.”