Executive Summary

Big Data is a trending technology in this age. Organizations are slowly realizing the true potential of analysing stored data that previously went unused, and that the resulting analysis can put them ahead of the competition.
Of the various types of Big Data analytics, real-time analytics is the fastest, as data is analysed within a fraction of a second. Organizations such as IBM, in collaboration with Kinetica, are researching and developing new systems that take less time to analyse huge amounts of data than traditional real-time systems.

Table of Contents

Introduction
Discussion
Utilization of Real-Time Data Analytics
Applications of Real-Time Data Analytics
Data Storage Infrastructures
Current Research and Development in Real-Time Data Analytics
IBM and Kinetica
PlanetSense
Conclusion
Reference List

Introduction

Big Data is a huge collection of data sets that is too complex to analyse with traditional data processing and analysis tools and methods (John Walker 2014). Organizations are applying Big Data to obtain accurate information about their market competition, to analyse share markets, to run games and maps that handle data in real time, and much more (Kelly et al.
2014). The purpose of this report is to understand the use of real-time data analytics in Big Data processing. It elaborates on stream data analytics, which is used to analyse real-time data, and on current research and development in this type of analytics.

Discussion

The collection and analysis of data is growing exponentially. Organizations have been collecting vast amounts of data every second, but only recently have they understood the true value of that data and how it can be used to improve their business approach.
All of the stored data needs to be mined effectively if it is to help predict future outcomes (Larose 2014). Real-time data analytics thus transforms a company’s reactive problem-handling approach into an automated, real-time learning environment.

Utilization of Real-Time Data Analytics

Real-time data analytics processes data as soon as it enters the system (Hatley and Pirbhai 2013). The data is processed and feedback is given to the user within milliseconds. Real-time interactive analytics is used to support queries that are interactive in nature.
The information is indexed for quick access, so responses to such queries are fast. Tools such as Apache Drill, Druid, VoltDB and SAP HANA keep indexed data in memory to make processing very fast.

Real-time data analytics is also used where the queries are fixed or static but the answer must be delivered in real time, that is, within milliseconds. Such instances include online game servers where multiple users interact with each other in real time.
The queries are fixed because the players act in a repetitive environment.

Stream processing systems include Apache Storm and Apache Samza. Real-time football analytics is an example of stream processing analytics (Stensland et al. 2014). Stream processing refers to the way data flows into the system: continuously and in small increments rather than in large batches.
Here, small quantities of data are recorded and processed every millisecond over the duration of the football match. This makes it possible to analyse the match while keeping track of the match and its players at the same time.

Applications of Real-Time Data Analytics

The applications of real-time data analytics are growing every day. Other common applications are:

· Customer Relationship Management (CRM) is one of the primary sectors where this analysis is used. Real-time data analytics can be used in CRM to provide an enterprise with up-to-the-minute information about a customer (Khodakarami and Chan 2014).
With good infrastructure, analysis of shared information can be provided within seconds of a customer interaction.
· Corporate dashboards are another place where real-time analytics can be used, to display the most up-to-date information that reflects the changes to the business at day’s end.
· A data warehouse is a combination of hardware and software resources that are specifically designed to process data (Kimball and Ross 2013).
Using a data warehouse, real-time data analytics can support the analysis and processing of queries that are ad hoc and unpredictable.
· Scientific data can also be analysed through this form of analytics. For example, data can be collected on the path, wind field and intensity of a hurricane and then used to predict the hurricane’s movement in advance.

Data Storage Infrastructures

There are two Big Data storage options: on-premise storage and cloud storage. A comparison between the two is provided as follows:

On-Premise Storage

· The Hadoop Distributed File System (HDFS) is primarily used for on-premise Big Data storage (Hildmann and Kao 2014).
· The first advantage of HDFS is that data can be stored on heterogeneous storage that combines spinning disks with solid-state drives (SSDs) (Song, Park and Jeong 2016).
· These storage devices can be either independent or attached.
· The second advantage is that end-to-end encryption is transparent, so data is protected without changes to applications while storing and retrieving data remains fast.
· One disadvantage of HDFS is that it works best for storing and processing data in a single data centre. It is not designed to operate over a WAN connection, so data is neither stored nor accessed globally.
· Data recovery becomes a serious problem if a disaster strikes the data centre. This disadvantage makes on-premise storage inefficient and prone to errors.

Cloud Storage

· The best options for storing data in the cloud are popular object stores such as Amazon S3, Google Cloud Storage and Azure WASB/ADLS (Jamshidi et al. 2015).
· The advantage of cloud storage is that data can be stored and analysed from any place.
· Object stores offer various capabilities that can be used to get the best result in different situations. Amazon S3, for example, offers S3 Standard, S3 Reduced Redundancy Storage, S3 Standard-Infrequent Access and Glacier. Each of these serves a different purpose, and the storage option should be chosen according to the user’s needs.
· One disadvantage of cloud object stores is the inconsistency of objects. The user needs to take special care to ensure a very predictable data pipeline when storing data in the cloud.
· Operating across multiple data centres can cause inconsistencies in data replication.

Current Research and Development in Real-Time Data Analytics

There has been a variety of research and development in the field of data analytics.

IBM and Kinetica

The most notable development is that IBM and Kinetica have built a GPU-accelerated database that runs on IBM’s OpenPOWER LC servers (Hater et al.
2016). Kinetica’s in-database analytics runs Artificial Intelligence (AI) and Business Intelligence (BI) workloads on a single database platform. Kinetica achieves faster analytics by using the Graphics Processing Unit (GPU), which enables it to handle huge datasets of multiple billions of rows in milliseconds. The indexed database is held in memory and offers location-based analytics, natural language processing, GPU-accelerated database operations, and native support for Geographic Information System (GIS) and Internet Protocol (IP) address objects.
The system features deep integration with leading open source frameworks such as Apache Spark, Hadoop, Accumulo, H2O and NiFi. Its developers claim that it is a hundred times faster than traditional or legacy in-memory databases, so that advanced analytics can be computed in less than a second at a fraction of the cost. Together, IBM’s OpenPOWER LC servers and Kinetica provide solutions for real-time problems and manage workload efficiency and scalability at the data centre.
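
The idea behind an in-memory, location-aware analytics query can be illustrated with a small sketch. The code below is not Kinetica's API. It is a plain-Python illustration that assumes the table is stored column by column in memory and that a location filter and an aggregate are applied to each row independently, which is the property a GPU can exploit to evaluate many rows at once. The column names (`lat`, `lon`, `amount`) and the function `sum_in_bounding_box` are hypothetical.

```python
from array import array

# Hypothetical in-memory table stored column by column.
# Each column is a contiguous array, so a scan touches only the columns it needs.
lat = array("d", [40.71, 34.05, 41.88, 40.73, 29.76])
lon = array("d", [-74.00, -118.24, -87.63, -73.99, -95.37])
amount = array("d", [120.0, 75.5, 60.0, 30.0, 200.0])


def sum_in_bounding_box(lat_col, lon_col, value_col, south, north, west, east):
    """Sum value_col over rows whose (lat, lon) lies inside the bounding box.

    Every row is tested independently of the others. A GPU-accelerated
    engine could evaluate many such rows in parallel; here the rows are
    simply visited one after another on the CPU.
    """
    total = 0.0
    matched = 0
    for row_lat, row_lon, value in zip(lat_col, lon_col, value_col):
        if south <= row_lat <= north and west <= row_lon <= east:
            total += value
            matched += 1
    return total, matched


# Location-based query: total amount for rows inside an example bounding box.
total, matched = sum_in_bounding_box(lat, lon, amount, 40.0, 41.0, -75.0, -73.0)
print(f"{matched} rows matched, total amount = {total}")
```

Keeping each column in its own contiguous array means the query reads only the three columns it needs, which is one reason columnar in-memory layouts suit this kind of scan.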
PlanetSense

PlanetSense is a real-time streaming platform with spatio-temporal analytics for collecting geo-spatial information from various open sources of data (Thakur et al. 2015). The platform consists of four main components:

· GeoData Cloud, a data architecture for storing and managing different datasets.
· A built-in mechanism for real-time data mining.
· A data analytics framework.
· Presentation and visualization of the data through a web interface and through Representational State Transfer (RESTful) services.
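
As an illustration of what spatio-temporal analytics over a real-time stream involves, the following sketch groups geo-tagged events by grid cell and time window as they arrive. It is a minimal example under stated assumptions, not PlanetSense's implementation: the event format `(timestamp, lat, lon)`, the one-degree grid, the 60-second window and all function names are hypothetical.

```python
from collections import Counter
import math

CELL_DEGREES = 1.0    # assumed grid resolution: 1 degree x 1 degree cells
WINDOW_SECONDS = 60   # assumed tumbling window length


def bucket_key(timestamp, lat, lon):
    """Map an event to (window start, cell row, cell column)."""
    window_start = int(timestamp // WINDOW_SECONDS) * WINDOW_SECONDS
    cell_row = math.floor(lat / CELL_DEGREES)
    cell_col = math.floor(lon / CELL_DEGREES)
    return (window_start, cell_row, cell_col)


def count_events(stream):
    """Consume events one at a time and keep a running count per bucket."""
    counts = Counter()
    for timestamp, lat, lon in stream:
        counts[bucket_key(timestamp, lat, lon)] += 1
    return counts


def sample_stream():
    """Hypothetical stand-in for a live feed of geo-tagged records."""
    yield (0, 35.96, -83.92)
    yield (12, 35.10, -83.20)
    yield (59, 36.16, -86.78)
    yield (61, 35.50, -83.70)


counts = count_events(sample_stream())
for key, n in sorted(counts.items()):
    print(key, n)
```

Because each event updates a single counter, the per-cell counts can be read at any moment while the stream is still running. A production system would also need to expire old windows and persist the results, which this sketch leaves out.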
Conclusion

Thus, it can be concluded that Big Data is a revolutionary technology in the field of information and data analysis. Real-time analytics serves a wide range of applications, from games to customer support. Natural disasters such as hurricanes and typhoons can be predicted, which could save many lives. Organizations are spending billions on the research and development of Big Data analytics to take the lead in the market.

Reference List

Hater, T., Anlauf, B., Baumeister, P., Bühler, M., Kraus, J. and Pleiter, D., 2016, June. Exploring Energy Efficiency for GPU-Accelerated POWER Servers. In International Conference on High Performance Computing (pp. 207-227). Springer International Publishing.

Hatley, D. and Pirbhai, I., 2013. Strategies for real-time system specification. Addison-Wesley.

Hildmann, T. and Kao, O., 2014, June. Deploying and extending on-premise cloud storage based on ownCloud. In Distributed Computing Systems Workshops (ICDCSW), 2014 IEEE 34th International Conference on (pp. 76-81). IEEE.

Jamshidi, P., Pahl, C., Chinenyeze, S. and Liu, X., 2015. Cloud migration patterns: a multi-cloud service architecture perspective. In Service-Oriented Computing-ICSOC 2014 Workshops (pp. 6-19). Springer, Cham.

John Walker, S., 2014. Big data: A revolution that will transform how we live, work, and think.

Kelly, M.F., Kelly, B.M., Petermeier, N.B., Kroeckel, J.G. and Link, J.E., Agincourt Gaming, Llc, 2014. Method for providing games over a wide area network. U.S. Patent 8,821,258.

Khodakarami, F. and Chan, Y.E., 2014. Exploring the role of customer relationship management (CRM) systems in customer knowledge creation. Information & Management, 51(1), pp.27-42.

Kimball, R. and Ross, M., 2013. The data warehouse toolkit: The definitive guide to dimensional modeling. John Wiley & Sons.

Larose, D.T., 2014. Discovering knowledge in data: an introduction to data mining. John Wiley & Sons.

Song, S.S., Park, S.H. and Jeong, K.H., Samsung Electronics Co., Ltd., 2016. Solid-state drive. U.S. Patent Application 15/147,922.

Stensland, H.K., Gaddam, V.R., Tennøe, M., Helgedagsrud, E., Næss, M., Alstad, H.K., Mortensen, A., Langseth, R., Ljødal, S., Landsverk, Ø. and Griwodz, C., 2014. Bagadus: An integrated real-time system for soccer analytics. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 10(1s), p.14.

Thakur, G.S., Bhaduri, B.L., Piburn, J.O., Sims, K.M., Stewart, R.N. and Urban, M.L., 2015, November. PlanetSense: a real-time streaming and spatio-temporal analytics platform for gathering geo-spatial intelligence from open source data. In Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems (p. 11). ACM.