Big Data - General Introduction

Big data (also spelled Big Data) is a general term for the large volume of unstructured and semi-structured data a company creates: data that would take too much time and cost too much money to load into a relational database for analysis. Although big data does not refer to any specific quantity, the term is often used when speaking about petabytes and exabytes of data.

A primary goal of analyzing big data is to discover repeatable business patterns. It's generally accepted that unstructured data, most of it located in text files, accounts for at least 80% of an organization's data. If left unmanaged, the sheer volume of unstructured data generated each year within an enterprise can be costly to store. Unmanaged data can also become a liability if information cannot be located in the event of a compliance audit or lawsuit.

Big data analytics is often associated with cloud computing because the analysis of large data sets in real-time requires a framework like MapReduce to distribute the work among tens, hundreds or even thousands of computers.
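To make the MapReduce idea concrete, here is a minimal single-machine sketch of its three steps (map, shuffle, reduce) applied to word counting. It is an illustration only, not the Hadoop API: the function names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count`, and the sample input, are assumptions made for this example. In a real cluster, each input split would be mapped on a different computer and the framework would handle the shuffle across the network.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(mapped_pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of input splits."""
    # Each document stands in for a split that a separate worker would process.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two small "splits" of text.
splits = ["big data is big", "data about data"]
counts = word_count(splits)
print(counts)
```

Because each call to `map_phase` depends only on its own split, and each call to `reduce_phase` depends only on one key, both steps can in principle run in parallel, which is what lets the work spread over many computers.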

The session gives an introduction to Big Data. It starts with a generally accepted definition of the term, then explores why Big Data is important in the current business scenario. It ends with an enumeration of the technologies used to analyze Big Data, such as MapReduce and NoSQL.

Key questions covered in the session:
(i) What is Big Data?
(ii) Why is Big Data important in the current business scenario?
(iii) How can an organization effectively use Big Data?
(iv) What are the important technologies used to analyze Big Data?
(v) What are the MapReduce, Hadoop, HDFS and NoSQL technologies?
