Big Data

Wikipedia: Big data

Slides: Big Data

Data produced every two days is the equivalent of all data available up to 2003 — Google’s Schmidt
Global data volume doubles every two years
90% of data in the world was produced in the last two years
Is There Hope for Small Firms, the Have-Nots in the World of Big Data?
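
As a rough illustration of the "doubles every two years" claim above, the compound-growth arithmetic can be sketched in Python. The function name and the ten-year horizon are illustrative assumptions, not figures from the slides.

```python
def growth_factor(years, doubling_period=2.0):
    """Multiple by which volume grows after `years`,
    assuming it doubles every `doubling_period` years."""
    return 2 ** (years / doubling_period)

# Annual growth rate implied by doubling every two years.
annual_rate = growth_factor(1) - 1

# Hypothetical horizon: how much larger is the volume after ten years?
ten_year_multiple = growth_factor(10)

print(f"Implied annual growth: {annual_rate:.1%}")
print(f"Growth over 10 years: {ten_year_multiple:.0f}x")
```

Doubling every two years works out to roughly 41% growth per year.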


Data Sources

  • Sensors: Internet of Things (cameras in metro areas and on structures; 200+ sensors in cars)
  • Connected devices
  • Social Media, data exhaust (Free space)
  • Clickstream data
  • Geo data
  • Loyalty cards
  • Transactional data
  • Structured versus unstructured data

How can you determine who would be most popular in the lead role in 50 Shades?

Three aspects to Big Data

  1. Storage
  2. Analytics
  3. Visualization

Attributes of Big data

  • Volume
  • Velocity (Fast data, actionable now)
  • Variety
  • Veracity

Changes in analysis



  • An algorithm is designed from the data and other sources. Individual data is then mapped against the algorithm(s) to present results, and the rules are adapted accordingly. It is a repeating set of rules.

  • Algorithmic business model based on Big data analytics: “if this, then that …”
  • Algorithms predict the future, based on Big data (the past)
  • Algorithms are sets of rules, created by understanding a business's data (big data analysis) and from external sources.
  • The input data is unique, the rules defined by the algorithm(s) are fixed, and the output is unique
  • Replaces human intuition
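
The "if this, then that" idea above can be sketched as a small rule table applied to each customer record. This is a minimal illustration, not a method from the slides: the rule names, thresholds, and customer fields are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # the "if this" part
    action: str                        # the "then that" part

# Hypothetical fixed rules, as if derived from past data analysis.
RULES = [
    Rule("lapsed customer", lambda c: c["days_since_purchase"] > 90, "send win-back offer"),
    Rule("big basket", lambda c: c["avg_basket"] >= 100, "offer free delivery"),
]

def apply_rules(customer, rules=RULES):
    """Map one customer's data against every rule; return the triggered actions."""
    return [r.action for r in rules if r.condition(customer)]

# Unique input, fixed rules, output specific to this shopper.
shopper = {"days_since_purchase": 120, "avg_basket": 45.0}
print(apply_rules(shopper))
```

The rules stay the same for everyone. Only the input record changes, so each shopper gets an individual result.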

The Filter Bubble: Algorithm vs. Curator & the Value of Serendipity
What determines your filter bubble?

Machine Learning

  • Increasing returns on data: more data, smarter results

  • Google (Hummingbird), Amazon, Facebook, Netflix, Walmart, Siri, etc.

Old World, New World

  • Old world, Analogue: same content, by vehicle / medium / retail store etc.
  • New world, Digital: content driven by algorithms, for example:

Examples of algorithm-driven content to increase engagement and lock-in:

Thought: A fake tweet sent after AP was hacked (Obama injured) caused an immediate 140-point dip in the stock market, which recovered just as quickly: False White House explosion tweet rattles market (Veracity)

Question: How can a mobile app augment the old world (shop) with the new world (content specific to the individual shopper)? From a dumb retail to a smart retail environment.

Additional Trends

  • Moore’s Law, cost of processing
  • Cost of Storage
  • Cloud Services: variable cost versus fixed cost investment
  • Syndicated data (Nielsen scanner data)
  • Growth in the Software as a Service (SaaS) industry providing Big Data solutions
  • Continuous Analytics
  • Predictive Analytics
  • Sentiment Analytics
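
The cloud trade-off listed above (variable cost versus fixed cost investment) comes down to a break-even calculation. The sketch below uses made-up prices purely for illustration. The function name and all figures are assumptions.

```python
def break_even_hours(fixed_cost, own_cost_per_hour, cloud_cost_per_hour):
    """Hours of use at which buying hardware costs the same as renting cloud capacity."""
    saving_per_hour = cloud_cost_per_hour - own_cost_per_hour
    if saving_per_hour <= 0:
        # Cloud is never more expensive per hour, so owning never pays off.
        return float("inf")
    return fixed_cost / saving_per_hour

# Hypothetical numbers: a 10,000 server that costs 0.10/hour to run,
# versus an equivalent cloud instance at 0.50/hour with no upfront cost.
hours = break_even_hours(fixed_cost=10_000, own_cost_per_hour=0.10, cloud_cost_per_hour=0.50)
print(f"Owning breaks even after about {hours:,.0f} hours of use")
```

Below the break-even point the variable-cost (cloud) option is cheaper, which is why it suits firms with low or uncertain usage.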

Additional issues

  • Primary data collection versus secondary use (sometimes unimagined)
  • Additional Uses
    • design better offers
    • target better consumers
    • listen
    • innovation
    • increased agility
    • first-degree price discrimination: pricing at each customer's willingness to pay
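
To see why first-degree price discrimination is attractive, compare the best single price with charging each customer exactly their willingness to pay. The figures below are invented for illustration, and the sketch assumes zero marginal cost.

```python
# Hypothetical willingness to pay of four customers.
willingness_to_pay = [5, 8, 12, 20]

def uniform_revenue(price, wtp):
    """Revenue at a single price: only customers willing to pay at least `price` buy."""
    return price * sum(1 for w in wtp if w >= price)

# The best single price is always one of the customers' own valuations.
best_uniform_revenue = max(uniform_revenue(p, willingness_to_pay) for p in willingness_to_pay)

# First-degree discrimination: every customer pays exactly what they are willing to.
perfect_discrimination_revenue = sum(willingness_to_pay)

print(best_uniform_revenue, perfect_discrimination_revenue)
```

The gap between the two numbers is the revenue a firm could capture if its data let it estimate each individual's willingness to pay.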

Cloud Computing

Google App Engine

Amazon Web Services

Windows Azure

Software as a Service



‘Worst breach in history’ puts data-security pressure on retail industry: need to move to EMV Standard (high switching costs)