Big Data
Three Challenges
- Volume - the size of the data.
- Velocity - the latency of data processing relative to the growing demand for interactivity.
- Variety - the diversity of sources, formats, quality, and structures that have to be integrated.
Big Data is any data that is expensive to manage and hard to extract value from.
In Big Data, "big" is relative; the data does not have to be large. Often, "difficult data" is what big data really means: it is less about sheer size and more about being challenging to work with.
Big Data History
The earliest notion - 1989:
The keepers of big data say they do it for the consumer's benefit. But data have a way of being used for purposes other than originally intended.
E-commerce, in particular, has exploded data management challenges along three dimensions: volumes, velocity and variety.
On Volume:
The lower cost of e-channels enables an enterprise to offer its goods or services to more individuals or trading partners, and up to 10x the quantity of data about an individual transaction may be collected - thereby increasing the overall volume of data to be managed.
On Velocity:
E-commerce has also increased point-of-interaction (POI) speed, and consequently the pace of the data used to support interactions and generated by interactions.
On Variety:
Through 2003/4, no greater barrier to effective data management will exist than the variety of incompatible data formats, non-aligned data structures, and inconsistent data semantics.
Big Data ... and the Next Wave of InfraStress
Disk capacities are growing incredibly fast, but disk latencies are not keeping pace.
Big Data Now
... the necessity of grappling with Big Data, and the desirability of unlocking the information hidden within it, is now a key theme in all the sciences - arguably the key scientific theme of our times.
Where does big data come from?
- "Data exhaust" from customers - companies now track a lot of information about what their customers do.
- New and pervasive sensors - we can now get visibility into data sources that we previously could not observe.
- The ability to "keep everything" - disk capacities have gone up and the cost of storing a byte has gone down, so we more or less have the ability to keep everything.
Examples
Car black boxes
More and more new cars will be equipped with black boxes that are a lot like the ones airliners have. The main reason is forensics in the event of a crash, but they also record a lot of other information. Insurance companies offer similar devices that you can opt in to: you voluntarily plug one in to reduce your insurance rates, and it tracks your speed and other aspects of your driving habits. This technology would have been hard to imagine 20 or 30 years ago.
HydroSense
HydroSense is a pressure-based sensor that automatically determines water usage activity and flow down to the source (e.g., dishwasher, laundry, shower) from a single non-intrusive installation point.
ElectriSense
ElectriSense is a single plug-in sensor that provides whole-home, device-level usage data. That is, using a single sensor plugged in anywhere in the home, ElectriSense can infer which electrical appliances are on and which are off. This data could be used for numerous applications, for example, providing home owners with an itemized electrical bill that not only shows the total energy consumption but also breaks the total down on a per-appliance basis (TV consumed 20 kWh, lighting consumed 18 kWh, and so on).
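To make the itemized-bill idea concrete, here is a minimal Python sketch of the billing step only. It is not ElectriSense's actual method: it assumes the sensor has already produced a list of on-intervals per appliance and that each appliance draws a fixed, known wattage. The function names, the interval format, and the wattage figures are all hypothetical, chosen so the result matches the 20 kWh / 18 kWh example above.

```python
from collections import defaultdict


def itemize_energy(intervals, rated_watts):
    """Sum energy per appliance in kWh.

    intervals:   list of (appliance, start_hour, end_hour) on-periods
                 (hypothetical format, assumed to come from the sensor).
    rated_watts: dict mapping appliance -> assumed constant power draw in watts.
    """
    totals = defaultdict(float)
    for appliance, start_h, end_h in intervals:
        hours_on = end_h - start_h
        # watts * hours = watt-hours; divide by 1000 to get kWh
        totals[appliance] += rated_watts[appliance] * hours_on / 1000.0
    return dict(totals)


def format_bill(totals):
    """Render one line per appliance (alphabetical) plus a grand total."""
    lines = [f"{name}: {kwh:.1f} kWh" for name, kwh in sorted(totals.items())]
    lines.append(f"Total: {sum(totals.values()):.1f} kWh")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    intervals = [("TV", 0, 100), ("Lighting", 0, 300)]
    rated_watts = {"TV": 200, "Lighting": 60}
    print(format_bill(itemize_energy(intervals, rated_watts)))
```

The hard part in practice is inferring the on/off intervals from a single sensor; once those exist, the per-appliance breakdown is just this kind of bookkeeping.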