HR and Big Data: Long Live Long Data!

We live in an era where HR is finally emerging from the dark ages and frantically struggling for a seat on the bandwagon of data-conscious, data-driven business functions. It is also an era where HR has yet to come to terms with the data value chain: data appreciation is the exception, data visualisation is still a dark art, data culture is practically non-existent and BI stands for Brain Injury.

Should we really be talking about Big Data in HR?

Where Do We Begin?

I recently spoke at an international HRTech conference on the topic of Talent Analytics and the drive towards a more proactive, predictive and business-driven HR function. Whilst there, and as I had expected, I was bombarded with numerous presentations and pitches about Big Data, its mainstream HR applications and implications, and how it is seemingly transforming HR as we speak!

Having been an HR professional for over 23 years, and originally an engineer by training, I am quite familiar with Big Data and its real-world applicability within quite singular environments: environments that typically generate terabytes upon petabytes of very fast and varied data that, more often than not, requires real-time analysis to generate worthwhile insight. Barring “questionable” futuristic sci-fi trends (e.g. wearable technologies and real-time sensors in the workplace), I have yet to witness the practical applicability of Big Data in HR. It is, in my opinion, just another hyped-up fad.

Where Are We Now?

There is no denying, however, that Big Data (and its big applications) has tremendous significance in a multitude of relevant contexts. It carries growing significance for almost all “high-tech” industries and businesses; Airlines, Retail, Medicare, Manufacturing, Meteorology, Oil & Gas, AI, Seismology, Marketing, Sales, etc. are becoming increasingly reliant on its potential. Note, however, that these are industries where data comes in really “Big”, i.e. at vast Volume, great Velocity and amazing Variety (the 3 Vs). It is then collected, cleansed, analysed, interpreted and used to generate insightful knowledge, usually in REAL-TIME, and for real-time application. Recognise HR yet?

Where Are We Heading?

In our quest to position Big Data within the HR framework, we need to take a couple of steps back and agree on the basic definitions. What is Big Data?

HR professionals need to truly understand, and appreciate, the technical connotations of Big Data. According to Gartner: “Big data is high Volume, high Velocity, and/or high Variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.” Additionally, a fourth V, “Veracity”, has been added by some organizations to further characterise Big Data (making it the 4 Vs model).


Most data collected by organizations used to be transactional data that could easily fit into rows and columns of relational database management systems; e.g. typical HR data. We are now witnessing an explosion of data from Web traffic, e-mail messages, and social media content (tweets, status messages), as well as machine-generated data from sensors (used in smart meters, manufacturing sensors, and electrical meters) or from electronic trading systems. This data may be unstructured or semi-structured and thus not suitable for relational database products that organize data in the form of columns and rows. We now use the term big data to describe these datasets with volumes so huge that they are beyond the ability of typical DBMS to capture, store, and analyze.

Big data doesn’t refer to any specific quantity, but usually refers to data in the petabyte and exabyte range; in other words, billions to trillions of records, all from different sources. Big data is produced in much larger quantities and much more rapidly than traditional data. For example, a single jet engine is capable of generating 10 terabytes of data in just 30 minutes, and there are more than 25,000 airline flights each day. Even though “tweets” are limited to 140 characters each, Twitter generates over 8 terabytes of data daily. According to the International Data Corporation (IDC) technology research firm, data is more than doubling every two years, so the amount of data available to organizations is skyrocketing.

Businesses are interested in big data because it can reveal more patterns and interesting anomalies than smaller data sets, with the potential to provide new insights into customer behavior, weather patterns, financial market activity, or other phenomena. However, to derive business value from this data, organizations need new technologies and tools capable of managing and analyzing non-traditional data along with traditional enterprise data (e.g. HR data).

Judging by the above, the need for truly unique technologies and analytics methods in relation to Big Data becomes quite evident. Wikipedia further elaborates on Big Data to state that: “Relational database management systems, desktop statistics and visualization packages often have difficulty handling Big Data. The work instead requires ‘massively parallel software running on tens, hundreds, or even thousands of servers’.” Thus, for some organizations, the realm of Big Data is only realised after data size exceeds tens or hundreds of terabytes, and even then, only in relation to specific applications. Add to that the rapid input velocity and variety expected of such data (the remaining two of the 3 Vs), coupled with the need for real-time processing and analysis, and you are far removed from your typical everyday HR data!
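To put that gap into numbers, here is a rough back-of-envelope sketch in Python. It converts the jet engine and Twitter figures quoted above into average data rates and sets them against an HR system. The HR figures (10,000 employee records of about 2 KB each, fully rewritten once a month) are purely my own assumption for illustration, and a generous one at that.

```python
# Back-of-envelope comparison of data velocity.
# The jet engine and Twitter figures are the ones quoted in the article;
# the HR figures are hypothetical assumptions chosen purely for illustration.

TB = 10**12  # bytes in a terabyte (decimal convention)


def bytes_per_second(total_bytes: float, seconds: float) -> float:
    """Average sustained data rate in bytes per second."""
    return total_bytes / seconds


# Quoted: a single jet engine can generate 10 TB in 30 minutes.
jet_engine_rate = bytes_per_second(10 * TB, 30 * 60)

# Quoted: Twitter generates over 8 TB per day.
twitter_rate = bytes_per_second(8 * TB, 24 * 60 * 60)

# Assumption: 10,000 employee records of ~2 KB each,
# fully rewritten once every 30 days.
hr_rate = bytes_per_second(10_000 * 2_000, 30 * 24 * 60 * 60)

for label, rate in [
    ("Jet engine", jet_engine_rate),
    ("Twitter", twitter_rate),
    ("HR system (assumed)", hr_rate),
]:
    print(f"{label:<22}{rate:>20,.1f} bytes/s")
```

Under these assumptions the jet engine averages roughly 5.6 gigabytes per second and Twitter roughly 93 megabytes per second, while the HR system trickles along at under 10 bytes per second. Change the HR assumptions by a factor of a thousand and the conclusion still stands.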

Where Should We be Heading?

Don’t get me wrong; I’m all for Big Data. I appreciate, more than anyone, the potential impact and influence of Big Data on the world today. Just not within the realm of HR, today! A massive, massive digital transformation is needed to refurbish HR with the infrastructure it needs to carry out Big Data operations routinely and effectively … and I do not just mean equipment!

What Are We Doing?

In HR’s inevitable transformation into the digital domain, most impact seems to be coming from the direction of Data Analytics, Visualisation, Cloud Computing, Collaboration, Mobile Technology, Social Media and, to a lesser extent, Wearable Technology. A possible contender to Big Data within the HR sphere is Long Data. Long Data typically refers to data collected over increasingly “long” periods of time with relatively wider frequency and narrower data points than is typical of Big Data. From a standard analytics viewpoint, richer insight is gained the further away we move from typical descriptive analysis towards more predictive engagements. Skills development, cultural & behavioural change, recruitment ROI, employee engagement, and attrition and retention would probably be among the classical examples requiring a longer term view of data collection and analysis with quite narrow data points and relatively wide collection frequency.
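As a minimal sketch of what a Long Data view might look like in practice, the Python snippet below tracks attrition using just two narrow data points (headcount and leavers) collected monthly over three years. The function name `rolling_attrition`, the 12-month window and all the figures are hypothetical, chosen only to illustrate the idea; this is not a prescribed HR metric definition.

```python
# A minimal "Long Data" sketch: narrow data points (two numbers per record),
# collected at a wide interval (monthly), over a long period (three years).
# All names and figures below are hypothetical and for illustration only.

from typing import List, Tuple

# (average headcount, leavers) for one month
MonthlyRecord = Tuple[int, int]


def rolling_attrition(records: List[MonthlyRecord], window: int = 12) -> List[float]:
    """Attrition rate over each trailing `window` months, oldest window first.

    Rate = total leavers in the window / mean headcount in the window.
    """
    rates = []
    for end in range(window, len(records) + 1):
        chunk = records[end - window:end]
        leavers = sum(left for _, left in chunk)
        mean_headcount = sum(head for head, _ in chunk) / window
        rates.append(leavers / mean_headcount)
    return rates


# Three years of hypothetical monthly records, oldest first.
history = [(1000, 8)] * 12 + [(1000, 10)] * 12 + [(1000, 13)] * 12

rates = rolling_attrition(history)
print(f"First 12-month window: {rates[0]:.1%}")
print(f"Last 12-month window:  {rates[-1]:.1%}")
```

No single month in this made-up history looks alarming, yet the trailing 12-month rate climbs steadily across the three years. That slow drift is exactly the kind of pattern that only a longer view of the data reveals.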

What Should We be Doing?

A word of caution, however: analytics within HR has historically been limited to examples similar to the above, examples that attract NO real concern outside the HR organisation. When it comes to Data Analytics, Data Culture and Data-driven outcomes, HR does not enjoy the strongest credibility within most organisations. I have only to look around me within the wider Middle-Eastern context to confirm my notions. Surprisingly enough, the HR terrain is not much different as you move further west! If HR is ever to improve its organisational credibility and image to levels typically enjoyed by the more data-savvy, data-conscious and data-driven business units like Project Management, Design, Engineering, Sales, Marketing, Finance, etc., a wider and more focused effort is needed in applying analytics to more pressing business-related needs and concerns.

In Conclusion …

It is my strong belief that we, as an HR community, have a lot of catching up to do when it comes to Data and the Data Value Chain. Let us first learn to walk before we attempt to fly.

Let’s not talk about Big Data, just yet! Let’s think Long Data.

