Written by our guest blogger: Lyndon Sundmark, MBA People (HR) Analytics Consultant / Data Scientist
Introduction
You may have had a chance to read Dr. John Sullivan’s article:
http://www.tlnt.com/2013/02/26/how-google-is-using-people-analytics-to-completely-reinvent-hr/
It seems to suggest that the re-invention of HR is strongly related to the use of People Analytics. This in turn seems to be strongly characterized by being ‘data driven’.
But what does it mean to be data driven? How can we recognize it when we see it?
Part of the problem, when any field or discipline is new or newly recognized, is that terminology is often not yet standardized or completely agreed upon. As a result, the same term can mean vastly different things to different people.
Consider the following few examples:
- Some might think they are data driven if their HR software gives them automated reminders to do something before a deadline.
- Some might think they are ‘data driven’ because their HR software provides them with HR activity metrics and scorecards and can display them graphically.
- Other companies might think that data driven HR occurs when their future decisions are actually assisted and justified by data.
Different perspectives to say the least. Again, it is likely due to the terminology being relatively new. The end result might sometimes be confusion and lack of clarity.
With that in mind, in this article, I offer my two cents worth of what data driven HR could look like. I say ‘could’ because my voice on this may be one among many as the terminology settles down and becomes more standardized and widely accepted. Having said that, I think there is enough already being written about this to start suggesting some basics that many people might be able to agree upon. These basics might be able to assist as a road map to move organizations more into the data driven direction and be seen as such by others.
In offering my two cents worth, I would like to suggest:
- Some criteria that might help more clearly identify what it means to be ‘data driven’, given the current definitions around Workforce, HR, and People analytics and perhaps other HR spheres of activity that can result in metrics.
- A possible framework which would include:
  - a lay of the land, with respect to what content areas may represent a significant part of the HR data picture for being data driven
  - Understanding the importance of appropriate analytical techniques being applied.
- Some brief examples, for a selected number of HR functions, on how to make them far more data driven. (These will be illustrated at a relatively high conceptual level. I hope to provide significant detailed examples of some of these in future articles)
Because I am one voice among many, I leave it up to the reader as to the value or usefulness of this picture.
Criteria for ‘Data Driven’
Why is it important to think about criteria that may help us recognize ‘data driven’ HR when we see it?
As mentioned above, I think the terminology in this area is still in flux, because by anyone’s definition it is a very recently recognized area of HR. Time, HR community dialogue, usage, experimentation, and the evolution of the field will, I think, eventually help much of the terminology settle down.
Given the fluidity, and the fact that my terminology, assumptions, and understanding may differ from yours, it’s helpful to start with some criteria. If you understand and accept the criteria based on your own experience, then a picture of what data driven HR looks like can at least be judged on whether it is consistent with those criteria.
What might be one set of criteria that might be a useful characterization of data driven HR?
- Data driven HR does not mean we completely replace human judgment. But, equally important, we don’t ignore the data and analysis if it’s there. Instead, we supplement human judgment with HR data analysis before we make decisions. Even when we use analytics extensively, human judgment in the analysis process itself is often necessary. We see an outcome or insight in the data which raises further questions and calls for further analysis. This is the human judgment part.
- Data driven means exactly that-‘data driven’. If we are serious about being ‘data driven’, we need to build up the analytical capacity and data management capacity in our organizations in a way that allows HR decisions to be based on data. The human resource decisions we make should be able to show evidence of the data, analysis and reasoning behind the decisions we make. Many traditional HR ERP systems provide a lot of HR transactional data, but that is not the entire picture. These ERP systems do very little in helping you understand how to run your business of HR better. And they do very little, in actually helping you make better HR decisions in the actual conducting of HR activities themselves. Finally, often in their default form, the data is not in the form necessary to be ‘data driven’. Often transformation of the data is necessary and appropriate analysis is critical.
- Data driven HR should mean that all HR functions are ultimately candidates for being seen as measurable and capable of being ‘data driven’. This means that we not only understand our HR field based on the methodologies we use, but we go a step further and understand it fully at a data, information, and measurement level. The best way to start is to understand every HR function as a process. We tend to think of recruitment, job classification, and training as functions. But all of them are really processes that have inputs, activities, and outputs. Can we look at these familiar areas and recognize, perhaps for the first time, that information (possibly in paper form currently) gets generated at each step of these processes? In other words, can we view our HR functions as processes that generate information that can be measured, so that we see them from an informational dimension and a measurement dimension? You really can’t be data driven in your HR operations if you do not see your HR functions as processes that generate information, and therefore things that you can measure.
- Data driven means you don’t have HR metrics and measurements simply for the sake of having them for ‘show’. If you are producing HR metrics simply for monthly management reports that get reviewed and never acted upon- WHAT is the point? Even when very powerful business intelligence tools are used to produce HR metrics, many implementations of HR BI still fall short on this criterion. Data is processed, tables and charts are produced, and often NO action is ever taken on what is presented. BI tools allow us to produce information and graphics in an automated way even faster- and still result in no action taken. Data driven should mean there is a stated purpose for everything that we calculate or produce, and if a metric shows us that we need to take action, then action is taken, and it is the correct action.
- Data driven means ‘hands on’. Period. HR business intelligence tools can be useful in producing the reports and summaries that may alert organizations that there is something that requires action. Taking targeted, precise, surgical corrective action however requires ‘hands on’ further data analysis. Analytics isn’t just about the ‘metrics’. It’s about the analysis of those metrics. Many vendor prepackaged solutions out there can promise and deliver pre-prepared metrics for you. BUT, can you export out data from their solutions for further analysis outside of their solution? If you can’t, how do you take targeted decisive action? I tend to have a bias against prepackaged metrics solutions for exactly that reason. Far better that you are in full control of the data being used, that it can be exported out and that you are able to choose the analytical tools used on the data. Data driven means ‘your’ hands on analysis, ‘your’ data analysis skills, ‘your’ data analysts with the tools- you can’t delegate that to a vendor.
- Data driven means statistics and statistical analysis. It was mentioned just above that producing information on which no action is taken, when action should be taken, is relatively pointless. But what is equally pointless is taking the wrong action or taking action when action shouldn’t be taken at all. One of the issues we face is that, with the technology that exists, there is more data out there than we can possibly analyze. And even when we take a manageable portion of that for a specific purpose, we sometimes can’t see the forest for the trees. From a statistical perspective, we are generally wondering whether there are differences between things in our HR measures, and if so whether they are statistically significant. If they aren’t, we have no evidence of real differences and often no action is needed. If they are significant, it may suggest action. You often can’t tell whether action is warranted without first doing statistical analysis. And if you choose to take action where there are no statistically significant differences, what is the appropriateness of that? You might be trying to solve a problem where there isn’t one. Or solving a problem where there is one, but with a scope that is too wide. Or you might be applying the wrong action to the situation. Statistical analysis is necessary to find the buried gems, or the needle in the haystack.
- Data driven will likely include building models (in this case statistical or data models). Not all HR questions of data require building of models but many do. If you agree that statistical analysis is a necessary criterion, then what you are actually doing is building either descriptive or predictive models to help you make the best decision based on what the data is suggesting. Suppose, for example, you are trying to predict whether a candidate will be a successful hire, and your past hiring has resulted in about 80 percent successful hires without being data driven. If you build statistical models that help you achieve 96 percent successful hires, both in prediction and in the actual results that occurred, that might be of significant value to you. One of the major purposes of statistics is to build models that help us make much better decisions than we would in their absence.
- Data driven is about the ‘analysis’ and the ‘analyst’. Others are writing about this currently in various articles and blogs as well. Software is a critical part of the picture, but it’s less about the specific tool, and more what the analyst can do with it to answer the questions asked. In other words you can have the right tools, but if you don’t have people that can use them, you are still stuck.
There may be many other criteria as well, but these are the significant ones that come to my mind.
A Possible Framework
So if we accept the previously mentioned criteria as a reasonable and usable set for recognizing data driven HR when we see it, it may raise some of the following questions:
- How do we figure out where we are in this data driven HR picture?
- Where are we exactly?
- How far along are we?
- How do we compare?
- What is our maturity in this area?
- What do we still need to do?
At least two major characteristics come to my mind for a framework:
- Data Content Coverage –what is relevant for data driven purposes and why
- Evidence of appropriate analytics on the data content to address key HR decisions and questions
Data Content Coverage
The ‘what we can gather information on and measure in HR’ is vast. And this isn’t restricted to what is in HR ERP systems but also anything else we might choose to gather and store locally. So how do we get our heads around this?
Aside from the fact that whatever we gather and measure should have a purpose (from our criteria above), much of what we could measure could come under at least 3 major categories:
- Measures related to what is going on with human resources in the organization – human resource activity and where applicable their tie to organizational outcomes.
- Measures related to how well HR processes are performing
- Measures related to HR methodologies
These categories aren’t necessarily mutually exclusive in terms of the content of the metrics within them. But they can be seen as relatively distinct in terms of the ‘reasons’ why we capture or record the metric.
I think this notion is a really important one with respect to HR metrics and data driven HR because at all times what should be at the forefront of our minds here is WHY we are recording the data, and calculating and analyzing the metric. I would like to elaborate a bit more on each of these to clarify.
HR Activity and Business Outcomes
The best way to describe this category is that these are the measures related to the activity generated by HR in all of its various functions and, where applicable, the appropriate ties to business outcomes. It gets at what is happening to the human resources in the organization: hires, terminations, movements, number of courses taken, number of incidents of injury, absenteeism, employee counts, number of grievances, number of job classification requests, HR costs per FTE, etc. Some good examples of these would be many that you find on SHRM’s site in the USA:
http://www.shrm.org/templatestools/samples/metrics/pages/default.aspx
And on the HRMA site in Canada:
http://www.hrma.ca/wp-content/uploads/2012/05/HR-Metrics-Standards-and-Glossary-v7.3.pdf
In any case, the idea here is that all HR functions tend to generate activity which in some fashion can be reported on: definitely qualitatively and more often than not quantitatively as well. When you look on the internet and in organizations, much of what is titled HR metrics is likely to be in this category. Historically, these types of metrics would typically have been presented in summary tables and charts in management reporting. So these types of metrics have been around for quite some time, whether or not they were recognized as ‘HR metrics’ when they were being produced. This can be a very extensive category of useful HR metrics and measures, and also often a good starting place for our review of the lay of the land of content areas. The key question becomes: what is the difference between providing these to management as management reports and an organization being truly data driven with respect to its HR information? (Hint: analysis and action where appropriate)
HR Processes
This category is concerned with measuring the performance of the HR department or function itself. It differs from the previous category, which pays attention to the entirety of human resources (the employees in the organization) rather than to the HR function or department. Here we are measuring HR operations.
There is a book that I came across recently, ‘Achieving HR Excellence Through Six Sigma’ by Daniel Bloom, that helps one to understand the lay of the land in this area. It’s not so much a ‘metrics’ book per se, but rather a book about the applicability of quality improvement initiatives to HR as a means of transforming HR. Quality improvement initiatives require measures and metrics. Another significant book on the topic of Six Sigma (not specific to HR) is Thomas Pyzdek and Paul Keller’s ‘The Six Sigma Handbook’.
The point here is that if organizations choose to transform their HR functions through quality improvement, metrics are an extensive part of that picture.
Every function in HR is a ‘service’ being provided. HR’s business is to provide HR services to the organization. As I mentioned in one of my previous blog articles, this is really seeing the provision of HR service in the Supplier->Input->Process->Output->Customer (SIPOC) model. The output is the service (i.e. recruitment services, labor relations services, job classification services, employee services etc.). It is provided to customers. The service is provided by an organizational process that exists for that purpose. Organizational resources are required as inputs for the process to carry out its activity. And suppliers generally exist to provide those inputs.
In this category of measures, anything that measures who the customer is, their satisfaction, volume and quality of output, process activity and quality, quality of input, or the suppliers would be a candidate. Some organizations have these types of measures and metrics; many or most do not. The most likely evidence of whether these metrics exist is whether quality improvement initiatives (Six Sigma being one example) have been applied to HR. If they have, there are many measures here that are candidates for being data driven.
HR Methodologies
When I think of this category of measures or data, I am thinking of data that is very specific and intrinsic to the purpose of the specific HR function itself.
For example, in Recruiting, measures would typically include:
- whether a person was a successful hire or not
- best candidate at time of hire
- age (after a person has been hired)
- education level
- years of experience
In a recruiting example, if the purpose is to ‘predict’ a successful hire, this category would include any data we gather during the recruiting process to determine the best candidate at time of hire, together with data on whether that person ultimately turned out to be a successful employee after hire. It is the information very specific to HR functions themselves.
In a training and development example, it might be ‘before and after’ measures of skills or competencies that the training is intended to address.
In a classification example, it might be information surmised from the job description and information about the job classification assigned to it.
The idea here is that it is the information gathered in that HR function necessary to make the decision which that function is designed or purposed for.
These measures say nothing about how well the function is working, and do not necessarily record human resources (employee) activity per se. In most organizations that I have been in, this category is rarely in evidence in terms of seeing it as measures that are quantifiable. A lot of this is due to the fact that, per the discussion above, HR professionals often don’t see the informational and data implications of the processes they carry out. Seeing HR as a ‘soft’ non-technical field doesn’t help this either.
Evidence of Appropriate Analytics -What We Are Doing With the Data
In the criteria mentioned above, describing what it means to be data driven, it was suggested that statistics, statistical analysis, and modelling will likely need to be part of the picture. Why is this?
Whether it is applied to HR data or any other domain of data, the purpose of statistical analysis and modelling is to help us understand the world around us, and in some cases to predict an outcome before it occurs or needs to occur. In other circumstances it might mean minimizing something we don’t want to occur or maximizing something we do, and in others simply understanding the relationships, or lack thereof, between various HR indicators. In ALL of these situations- we are collecting for a purpose.
When we gather HR data for ‘data driven’ purposes, some of the information will be HR outcomes (a target or ‘dependent’ variable in statistical terms), and much of the rest might be either influencers or predictors of the ‘outcome’. These influencers and predictors are often described in statistical terminology as ‘independent’ variables. The whole point here is that we are trying to improve the outcomes beyond just human judgment alone. This is consistent with the tone of the Google article linked at the beginning of this article:
“The basic premise of the “people analytics” approach is that accurate people management decisions are the most important and impactful decisions that a firm can make. You simply can’t produce superior business results unless your managers are making accurate people management decisions.”
When we are ‘data driven’ we seek to not only gather relevant data, but process it appropriately, based on the questions we are asking of the data. It’s applying the right statistical tool or analysis to answer the question asked in the first place. In doing that, we are often building models along the way that help us improve our decision making beyond just human judgment alone.
So in summary, the pillars of a framework for being data driven could include understanding:
- the HR questions we are asking
- the totality of the relevant HR ‘data’ domain
- the statistical tests and models we are applying and why
Having a framework can help us understand not only where we are as an organization in this fluid field of HR, but also our maturity level within it. At least one view of organizational maturity here would be:
- how much of the HR data driven domain of data are we actually collecting
- how much of that data collected is actually being analyzed, and used in statistical models to answer critical HR /People related questions
- how much has our HR decision making in the organization improved as a result of being data driven
- how much has organization performance improved as a result of data driven HR decision making
‘Data driven’ organizational maturity has little to do with one technique or model being in evidence over another, and more to do with having the right tools and data to address the questions being asked in the first place. Predictive, descriptive, correlational, and quality improvement techniques are all examples of the types of tools and analyses available. My take is that the level of maturity in being ‘data driven’ is a function of how often you are asking HR questions of your data, how much of the HR data lay of the land you are covering content-wise, and to what degree you are applying the right tool to the data for the question you are asking.
Some Possible Examples of Data Driven HR
If we put together much of what has been described above, one might conclude that ‘data driven’ HR becomes a reality when we
- Collect HR data for a purpose (that purpose being to answer important HR questions or make better HR decisions). What are the questions we are asking in the first place?
- Choose appropriate analytical tools (typically statistical tools) to help understand much more clearly what the data is telling us (in answer to those questions)
- Use those results to take action, or not, based on what the data is telling us
What are some examples of this from a high level conceptual perspective? Here might be a few illustrative examples.
HR Activity
As mentioned previously, these types of metrics tend to reflect what is going on with the employees in organizations. Since ‘data driven’ should mean our decisions are based on what the data is telling us, this in turn is based on what questions we are asking. Let’s look at a few examples
Turnover
Generally we are interested in turnover because of what turnover costs us in terms of knowledge that walks out the door, or cost of replacement, or both. Turnover as an HR Activity metric is therefore critical. Taking action on it is even more important. Data driven here can mean both descriptive and predictive analytics: predictive possibly for the impact on our HR employee inventory of the future, descriptive in understanding the nature of our problem or issue at the present time.
Example Key Questions for Data to Answer
- What is our level of turnover? This is our organizational metric, seen at the present moment and over time.
- How does it compare to other organizations?
- Is there a level of turnover that will likely occur regardless of what the organization does?
- Where is it occurring? Could be sliced and diced by where you find it in the organization, age groups, gender, length of service group just as a few examples
- Is the particular slice and dice significant? – statistical tests such as analysis of variance
- Does reduction of it require overall action or targeted action? Results from statistical tests
Analytic Approaches
- Creation and provision of metric
- Benchmarking of our metric
- Analysis of the metric by much of the categorical and quantitative data we have on our employees. I.e. does this metric vary by these qualitative and quantitative measures? If it varies are the results statistically significant? Is there any interaction between these measures and the resulting turnover?
- Statistical tools- correlational analysis, analysis of variance, linear regression depending on questions asked
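To make the question "is the particular slice and dice significant?" concrete, here is a minimal sketch of one possible test: a two-proportion z-test comparing turnover rates in two groups. It is a simpler stand-in for the analysis of variance mentioned above, it relies on a normal approximation that is only reasonable for fairly large groups, and the department counts are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(leavers_a, headcount_a, leavers_b, headcount_b):
    """Return (z, two-sided p-value) for the difference between two turnover rates."""
    rate_a = leavers_a / headcount_a
    rate_b = leavers_b / headcount_b
    # Pooled rate under the null hypothesis that both groups share one turnover rate.
    pooled = (leavers_a + leavers_b) / (headcount_a + headcount_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / headcount_a + 1 / headcount_b))
    z = (rate_a - rate_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 30 of 200 employees left department A, 12 of 180 left department B.
z, p = two_proportion_z_test(30, 200, 12, 180)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value would suggest the gap between the two groups is unlikely to be chance alone, which points toward targeted rather than organization-wide action. A large one would suggest the apparent gap may not be a real difference at all.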
Absenteeism
This is another good illustrative example of activity related HR metrics. It represents a significant cost to organizations because people who should be at work aren’t. There is probably a ‘true’ level of absenteeism, reflecting how often people are off work because they are really sick, that organizations are unlikely to be able to go below no matter how much attention and effort they give to minimizing it. Other than recognizing that, the general intent with this metric is to minimize absenteeism, period.
Key Questions for Data to Answer
- What is the level of absenteeism in our organization?
- Is there a natural level of absenteeism that is likely to occur based on the fact that everyone will likely be sick during the year sometime? If so what might that level be?
- Where is it occurring?
- How does it vary with other categorical and quantitative measures we have on employees?
- Is this variance statistically significant?
- Does reduction of it require overall action or targeted action?
Analytic Approaches
- Creation and provision of metric
- Analysis of the metric by much of the categorical and quantitative data we have on our employees. I.e. does this metric vary by these qualitative and quantitative measures? If it varies are the results statistically significant? Is there any interaction between these measures and the resulting absenteeism?
- Statistical tools- correlational analysis, analysis of variance, linear regression depending on questions asked
- Benchmarking is not likely relevant here. Short of being at whatever level might be a natural minimum, any absenteeism in the organization beyond that costs unnecessary money. Comparing to others is irrelevant.
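As a rough illustration of the correlational analysis mentioned above, the sketch below computes a Pearson correlation between absence days and one quantitative employee measure, then uses a permutation test to estimate whether a relationship that strong could plausibly arise by chance. The choice of overtime hours as the measure, and all of the numbers, are assumptions made purely for the example.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Estimate how often shuffled data gives a correlation at least as strong."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one smoothing avoids reporting a p-value of exactly zero.
    return (hits + 1) / (trials + 1)

# Hypothetical data: monthly overtime hours and annual absence days for ten employees.
overtime_hours = [0, 2, 5, 8, 10, 12, 15, 18, 20, 25]
absence_days = [3, 2, 4, 5, 4, 6, 7, 6, 9, 10]

r = pearson_r(overtime_hours, absence_days)
p = permutation_p_value(overtime_hours, absence_days)
print(f"r = {r:.2f}, permutation p = {p:.4f}")
```

A strong, significant correlation like this would only flag where to look next. It would not by itself show that overtime causes absence.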
HR Processes
As mentioned previously, this area of HR metrics and measurements isn’t concerned with employee activity in the organization, but rather with the operation of our HR processes themselves. My own experience is that these are often only in evidence if either an external quality improvement intervention has occurred on an HR process, or HR itself has embraced quality improvement as a philosophy and methodology for how it conducts HR. A couple of good examples that come to mind are recruiting processes and job classification processes.
Key Questions for Data to Answer
- How effective and efficient are our processes?
- How satisfied are our customers?
- How long do our processes take to deliver a service from time of request to completion? How can we minimize that?
- How long do various steps in the process take?
- Are our processes in control, or out of control statistically?
- What are the value added and non-value added steps in the process? How can we get rid of the non-value added steps while still allowing for process to deliver the service to the customer’s satisfaction?
Analytic Approaches
- Generate overall length of time data for each service by having all service requests to HR tracked in a ticketing system that logs when the request came in, when work on it began, and when it was completed. Identify measures of customer satisfaction, which might include capturing feedback from the client on how HR performed on their request.
- Improve such HR tracking systems to allow capturing of steps and their start and end times and dates.
- Apply quality improvement methodologies (e.g. Six Sigma) including their analytics tools such as run charts, fishbone diagrams, correlational analyses, and scatter plots.
- Take action based on the results of those methodologies and continue to monitor for improvement.
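One way to approach the question of whether a process is statistically "in control" is an individuals control chart on request-to-completion times. The sketch below assumes a hypothetical ticket log with a received date and a completed date per request, derives the cycle time in days, and flags requests outside the control limits. The ticket ids and dates are invented, and a real ticketing system's export would look different.

```python
from datetime import date

# Hypothetical ticket log: (request id, date received, date completed).
tickets = [
    ("REQ-01", "2024-01-02", "2024-01-12"),
    ("REQ-02", "2024-01-05", "2024-01-17"),
    ("REQ-03", "2024-01-08", "2024-01-19"),
    ("REQ-04", "2024-01-10", "2024-01-19"),
    ("REQ-05", "2024-01-15", "2024-01-28"),
    ("REQ-06", "2024-01-18", "2024-01-28"),
    ("REQ-07", "2024-02-01", "2024-02-13"),
    ("REQ-08", "2024-02-05", "2024-02-16"),
    ("REQ-09", "2024-02-08", "2024-03-09"),
    ("REQ-10", "2024-02-12", "2024-02-22"),
]

def cycle_days(received, completed):
    """Elapsed calendar days between receiving and completing a request."""
    return (date.fromisoformat(completed) - date.fromisoformat(received)).days

def individuals_chart_limits(values):
    """Return (lower, center, upper) limits for an individuals (XmR) chart."""
    center = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    avg_moving_range = sum(moving_ranges) / len(moving_ranges)
    # 2.66 is the standard individuals-chart constant (3 / 1.128).
    spread = 2.66 * avg_moving_range
    # A duration cannot be negative, so the lower limit is floored at zero.
    return max(center - spread, 0), center, center + spread

durations = [cycle_days(received, completed) for _, received, completed in tickets]
lower, center, upper = individuals_chart_limits(durations)
flagged = [ticket_id for (ticket_id, _, _), days in zip(tickets, durations)
           if days < lower or days > upper]

print(f"center = {center:.1f} days, limits = ({lower:.1f}, {upper:.1f})")
print("Requests outside the limits:", flagged)
```

A flagged request is a prompt to investigate that specific case for a special cause. Points inside the limits reflect the routine variation of the process, which only changes when the process itself is redesigned.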
HR Methodologies
As mentioned above, these are metrics and measures related to how we actually do what we do in our various HR functions. The intent with these measures is that they aren’t a byproduct of HR activity but are infused directly into how we do what we do, such that the technique cannot produce its end result without them. In my experience very few organizations incorporate metrics of this type, down to this level of how we ‘do’ HR. And yet if the issue here is ‘reinventing HR’, how can we reinvent without this? I will give a couple of illustrative examples of this in recruiting and job classification.
Recruitment
Key Questions for Data to Answer
- Who is the best candidate based on what we know at time of hire?
- Who might end up being an involuntary termination down the road, based on what we know at time of hire?
- Who might end up being a voluntary termination in the short term, when we want employees who provide us either with long service, or at least a minimum of two years, to make our hiring investment worthwhile?
- Where are our best sources of candidates? How do we know this?
- What is the best approach to advertising job opportunities?
- Which pieces of information that we collect, or could collect, prove to be good predictors of a successful hire?
- Are there any relationships between these predictors?
Analytic Approaches
- Develop measures to collect as data on candidates during the recruiting process (these could include length of service with previous organizations, number of years of total experience, number of years of directly relevant experience, education level, etc.), along with whether they were offered a job and, if they were, their length of service until termination and whether the termination was voluntary or involuntary.
- Utilize statistical approaches such as linear regression where what you are trying to predict as success is a quantitative measure, and possibly decision trees and linear discriminant analysis when the success criterion is categorical, to try to improve the batting average beyond human judgment alone.
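To give a feel for what a decision tree does with a categorical success criterion, the sketch below finds the single best split on one predictor by minimizing weighted Gini impurity. This is only the first step a tree-building algorithm would take, not a full decision tree. The predictor (years of directly relevant experience), the success labels, and the function names are all hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcomes (0.0 means the group is pure)."""
    if not labels:
        return 0.0
    share_successful = sum(labels) / len(labels)
    return 2 * share_successful * (1 - share_successful)

def best_split(values, labels):
    """Return (threshold, weighted impurity) for the best 'value < threshold' split."""
    best = None
    # Skip the smallest value, since splitting there would leave the left group empty.
    for threshold in sorted(set(values))[1:]:
        left = [label for value, label in zip(values, labels) if value < threshold]
        right = [label for value, label in zip(values, labels) if value >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical past hires: years of directly relevant experience at time of hire,
# and whether the hire was later judged successful (1) or not (0).
relevant_experience = [0, 1, 1, 2, 3, 4, 5, 6, 8, 10]
successful_hire = [0, 0, 1, 0, 0, 1, 1, 1, 1, 1]

threshold, impurity = best_split(relevant_experience, successful_hire)
print(f"Best split: experience < {threshold} years (weighted Gini {impurity:.2f})")
```

With real data one would use several predictors, hold back some hires to check how well the rule predicts, and compare the result with the success rate achieved by judgment alone.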
Job Classification
Key Questions for Data to Answer
- Given a series of job families and job levels (i.e. job classes) and a new job and job description that requires proper classification –how does one slot this new job description into the appropriate job classification and especially level, IF that description doesn’t exactly match the description of the job families and levels given? More often than not job descriptions themselves will likely straddle the characteristics of more than one job classification description. How do we choose? How do we ensure that our decisions are consistent and fair?
Analytic Approaches
- Quantify information that exists in the job description AND job class descriptions in some way. This could include education level required, experience level required, organizational impact, problem solving ability required and job grade etc.
- Conduct decision tree analysis or discriminant analysis as example approaches on this data to understand how these features in job descriptions characterize the job classification description groups themselves, and how well these features predict the job grade group.
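The sketch below illustrates the idea with a nearest-centroid classifier, which is a much simpler relative of discriminant analysis (the two coincide only under restrictive assumptions such as equal, spherical spread in every class). It assumes each job description has already been rated from 1 to 5 on four factors: education, experience, organizational impact, and problem solving. The levels, factor ratings, and names are hypothetical.

```python
from math import dist

# Hypothetical benchmark jobs per level, each rated 1 to 5 on
# (education, experience, organizational impact, problem solving).
benchmark_jobs = {
    "Level 1": [(1, 1, 1, 2), (2, 1, 1, 1), (1, 2, 2, 1)],
    "Level 2": [(3, 3, 2, 3), (2, 3, 3, 3), (3, 2, 3, 2)],
    "Level 3": [(4, 5, 4, 4), (5, 4, 5, 4), (4, 4, 4, 5)],
}

def centroid(rows):
    """Average rating on each factor across the benchmark jobs of one level."""
    return tuple(sum(column) / len(rows) for column in zip(*rows))

def rank_levels(job, jobs_by_level):
    """Return (level, distance) pairs, closest level first."""
    centroids = {level: centroid(rows) for level, rows in jobs_by_level.items()}
    distances = {level: dist(job, center) for level, center in centroids.items()}
    return sorted(distances.items(), key=lambda pair: pair[1])

# A new job whose description straddles Level 2 and Level 3.
new_job = (3, 4, 3, 3)
ranking = rank_levels(new_job, benchmark_jobs)
for level, distance in ranking:
    print(f"{level}: distance {distance:.2f}")
```

Reporting the distance to every level, not just the winner, shows how close the call was. That supports the consistency and fairness concern raised in the key question, because the same ratings always produce the same ranking.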
All of the examples above are intended to be ‘illustrative only’ of the fact that being more data driven in our HR decision making is only limited by our imagination and the questions that we decide are important to ask.
Conclusion
My primary intent in this blog article has been to assist, if I can, in providing some clarity in understanding of ‘data driven HR’ through:
- Suggestions of criteria
- A possible framework describing the lay of the land in terms of coverage
- Examples of some of the analytical approaches.
Anything I have provided above is a minuscule portion of the picture, and once again is therefore illustrative only and not intended to be exhaustive.
What are some takeaways?
- Data driven is ‘new’. The ability to do this is not. The gathering of HR data, the ability to put it in computerized datasets, the means to apply statistical software to those datasets, and to assist in HR decision making is not new. The means to do all of this has been around for over 35 years in one form or another. The recognition of this by HR and the actual doing of it IS new, and is one of the reasons for the emergence now of People Analytics and data driven HR.
- ‘Data driven’ is not simply about the metrics themselves and their preparation. If we are going to be ‘data driven’ it is about the actual usage of these. It means taking action where action is suggested, and not taking action when it isn’t suggested. And when we do take action that it is the right action as evidenced and supported by the data. There is no ‘driven’ without this. Without ‘driven’, we are simply generating more data that no one will use.
- ‘Data driven’ is definitely ‘hands on’. If we accept the premise that ‘data driven’ means that we ask questions related to HR decisions that need to be made , determine and/or collect the data needed to answer the question, and then conduct appropriate statistical analyses to answer the question- this is unlikely to be able to be done automagically. Analytics software with prepackaged analytics may help in the preparation of the metrics. But if we can’t interact with that and perform additional statistical tests to answer other questions, we are artificially limited. HR needs to understand this dynamic and ensure it has those features in its solutions or data stores, and that HR employees have the HR domain, IT and statistical analysis skills to be able to do ‘hands on’.
- Being data driven has the potential to pull together a number of areas out there into directly assisting HR to reinvent itself:
  - People, Workforce, and HR analytics metrics
  - Quality improvement and HR (i.e. Six Sigma)
  - Statistical methods and models directly applied to HR methodologies
The potential for HR to reinvent itself through being data driven is huge. Each organization needs to determine how much of a priority it will be in their circumstances.