For HR professionals, the melding of data science and HR practice is a relatively recent development.
It’s a complex relationship – HR professionals aren’t necessarily data scientists, and vice versa. Nevertheless, pursuing data-based decisions in HR makes a lot of sense for the business that wants to act on real information. Overall, businesses are increasingly reliant on HR departments being able to provide insights and make recommendations based on the data that they gather.
One of the things that HR analytics requires is use of the right tools. You need to be able to mine data, interpret it, and make recommendations based on what you find. For this reason, many HR professionals have turned to Excel spreadsheets. They can be set up with formulae and different analytical tools to process data.
However, Excel has limitations – these become especially apparent if you’re dealing with large amounts of data. R offers more options for HR professionals, which we’ll outline here.
Why use R over Excel?
In my own experience, R has allowed me to “play with data” in ways I couldn’t with Excel, so yes, it’s a personal preference. However, a quick look around at the blogs of other data analysts shows that R is a popular choice. (Yes, you could look at languages like Python or SQL, but for this kind of analysis I find R tends to win on simplicity.)
Here are some reasons you should consider learning R:
R handles very large datasets
Excel is limited by the number of rows and columns available on each sheet (1,048,576 rows by 16,384 columns in current versions). If you run out, you either open a new tab or a whole new file.
HR data grows over time, so there are plenty of cases where you might run out of room on an Excel spreadsheet. Besides that, the fixed number of columns caps how many variables and datasets you can bring into a single sheet.
There are all sorts of analyses you can perform if you have enough datasets at your disposal, and R gives you this flexibility. In R, the practical limit is your computer’s memory rather than the size of a grid, so you have more options for looking at and processing data.
In a similar vein, when you use the maximum space available to you in Excel, the penalty you pay is often very slow performance. It’s not uncommon for formula-heavy sheets with 100,000 or more rows to take 15 minutes or longer to open. On the other hand, R can run in less than 30 seconds with 1 million “rows” in place. You’re not sacrificing speed for more data – R will handle large datasets and keep running efficiently.
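To get a feel for this on your own machine, here is a minimal sketch using only base R. The data is simulated and the column names (`employee_id`, `department`, `hours`) are invented for illustration; timings will vary by machine, so treat the result as a rough check rather than a benchmark.

```r
# Simulated HR records: one row per employee per month (all values invented).
set.seed(42)
n <- 1e6
hr <- data.frame(
  employee_id = sample(1:20000, n, replace = TRUE),
  department  = sample(c("Sales", "Engineering", "Support", "Finance"),
                       n, replace = TRUE),
  hours       = round(runif(n, min = 120, max = 200))
)

# Time a typical summary: average monthly hours per department.
elapsed <- system.time(
  avg_hours <- aggregate(hours ~ department, data = hr, FUN = mean)
)

print(nrow(hr))             # number of rows held in memory
print(avg_hours)            # one row per department
print(elapsed["elapsed"])   # seconds taken; varies by machine
```

The whole table lives in memory as a single data frame, so there is no splitting across tabs or files.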
R has better data manipulation capabilities
To begin with, where does your data come from? As an HR professional, you probably have several different sources and need to bring that data together somehow. In Excel, you can spend a lot of time just on downloading and importing data.
On the other hand, R can bring in data automatically with the input of a line of code. Sure, you might experience a learning curve as you figure out the coding, but data is much faster to come by once you get it.
Once you have the data, R is well ahead of Excel when it comes to automation and calculation. If you want to get down to fine details, R can handle a lot of analysis and may help you spot trends you hadn’t considered looking for. It can help you clean and organize data, and even check whether your results are statistically sound. As an added bonus, it can read data in most common formats.
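As a rough illustration of importing, combining and cleaning, here is a small base R sketch. The two “exports” are invented and inlined as text so the example runs on its own; the names `payroll`, `hris` and `staff` are hypothetical. With real files you would pass a file path to `read.csv()` instead of the `text` argument.

```r
# Two hypothetical exports, inlined as text so the example is self-contained.
payroll_csv <- "employee_id,salary
101,52000
102,61000
103,NA
104,58000"

hris_csv <- "employee_id,department,hire_date
101,Sales,2019-03-01
102,Engineering,2017-11-15
103,Sales,2021-06-20
105,Support,2020-01-10"

# One line per source to read the data in.
payroll <- read.csv(text = payroll_csv)
hris    <- read.csv(text = hris_csv)

# Combine the sources on the shared key; only employees present in both remain.
staff <- merge(hris, payroll, by = "employee_id")

# Basic cleaning: parse the dates and drop rows with a missing salary.
staff$hire_date <- as.Date(staff$hire_date)
staff <- staff[!is.na(staff$salary), ]

print(staff)
```

Because every step is written down, rerunning the script on next month’s exports repeats the same import, join and clean-up without any manual copying.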
Excel is great if you have smaller amounts of data – it’s simple to point and click at numbers and create pivot tables. It falls down when you start to get more complex (and no one enjoys dealing with Excel when it crashes from the weight of a lot of data!).
Time to build
One of the hesitations HR professionals often have over R is that it takes some commitment to learning the coding in order to master it. It is a whole new language and some people will definitely find it more challenging than others.
However, once you’ve mastered a few basics of R, you’ll find that almost anything you build in Excel can be executed much faster with R. This is because R code is reproducible. The same script can be run repeatedly, and on very different datasets, in ways that Excel formulas and VBA code tied to a particular workbook cannot. This makes it easier than VBA in the long run, and worth the commitment to learning.
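Here is a minimal sketch of what that reuse can look like. The function name `turnover_by_department` and the columns `department` and `left` are my own invention for illustration, and the two quarterly datasets are made up.

```r
# A hypothetical reusable summary: headcount and turnover rate per department.
# Expects a data frame with columns `department` (text) and `left`
# (TRUE if the employee left during the period).
turnover_by_department <- function(df) {
  data.frame(
    department    = sort(unique(df$department)),
    headcount     = as.vector(table(df$department)),
    turnover_rate = as.vector(tapply(df$left, df$department, mean))
  )
}

# Two invented datasets with different departments and sizes.
q1 <- data.frame(
  department = c("Sales", "Sales", "Engineering", "Engineering"),
  left       = c(TRUE, FALSE, FALSE, FALSE)
)
q2 <- data.frame(
  department = c("Sales", "Support", "Support", "Support"),
  left       = c(FALSE, TRUE, TRUE, FALSE)
)

# The same code runs unchanged on both.
q1_summary <- turnover_by_department(q1)
q2_summary <- turnover_by_department(q2)
print(q1_summary)
print(q2_summary)
```

The function is written once and then applied to any dataset with the expected columns, which is hard to do with formulas wired to specific cell ranges.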
R allows for any type of statistical analysis
When you look at the data analysis that Excel is capable of, I’d class it as “basic to intermediate.” Running third-party macros in Excel hasn’t really caught on – largely due to security concerns.
On the other hand, R is an open source program and has a huge community behind it. This has led to some very sophisticated libraries for statistical analysis, covering basically anything you can think of. R promotes this sharing of libraries, giving access to new functions that may be applicable to your data.
You could argue that Excel has VBA which will allow you to do most of the things that R can, but the difference is in how time-consuming it is. R allows you to copy and paste code for easy reproduction, whereas in VBA, you will spend a long time setting it up each time.
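To give one example of analysis beyond the basics, here is a sketch of a logistic regression using `glm()`, which ships with base R. The attrition data is simulated, and the variable names (`tenure_years`, `overtime`, `left`) and the effect sizes are invented purely for illustration.

```r
# Simulate a small attrition dataset (all values invented).
set.seed(1)
n <- 500
tenure_years <- runif(n, min = 0, max = 15)
overtime     <- rbinom(n, size = 1, prob = 0.3)

# Assumed relationship for the simulation only:
# leaving becomes less likely with tenure and more likely with overtime.
p_leave <- plogis(-0.5 - 0.2 * tenure_years + 1.0 * overtime)
left    <- rbinom(n, size = 1, prob = p_leave)
staff   <- data.frame(tenure_years, overtime, left)

# Fit a logistic regression of leaving on tenure and overtime.
fit <- glm(left ~ tenure_years + overtime, data = staff, family = binomial)
print(summary(fit)$coefficients)

# Predicted probability of leaving for a hypothetical employee.
new_hire  <- data.frame(tenure_years = 1, overtime = 1)
p_new_hire <- predict(fit, newdata = new_hire, type = "response")
print(p_new_hire)
```

The model itself is a single line of code, and the same pattern carries over to many other model types available in R and its libraries.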
R has better visualisation tools
The graphical capabilities of R far exceed Excel. While Excel is great for simple charts that you might want to quickly throw together for a presentation, R gives you the option of much more complex visualisations.
With ggplot2 in R, you can quickly create any type of plot you need and customize any aspect of it. For example, R allows you to create a scatterplot matrix, CDF plots and other more complex data visualisations. If you want to really highlight your big data, perhaps in a published report, R gives you the opportunity to create a much more impressive display.
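As a quick sketch of the two plot types mentioned above, the example below draws a scatterplot matrix and an empirical CDF. To keep it free of package dependencies I have swapped in base R graphics (`pairs()` and `plot(ecdf())`) rather than ggplot2; ggplot2 offers `stat_ecdf()` for the same kind of CDF plot. The data and column names are simulated and invented.

```r
# Simulated HR measures (all values invented).
set.seed(7)
staff <- data.frame(
  tenure_years = runif(200, min = 0, max = 15),
  salary       = rnorm(200, mean = 60000, sd = 8000),
  engagement   = pmin(pmax(rnorm(200, mean = 3.5, sd = 0.8), 1), 5)
)

# Write both plots to a temporary PDF instead of the screen.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

# Page 1: scatterplot matrix of every pair of variables.
pairs(staff, main = "Scatterplot matrix of simulated HR measures")

# Page 2: empirical CDF of salary.
plot(ecdf(staff$salary),
     main = "Empirical CDF of salary",
     xlab = "Salary",
     ylab = "Proportion of employees at or below")

dev.off()
print(out_file)   # location of the generated PDF
```

Every title, label and layout choice sits in the code, so a chart can be regenerated exactly when the data changes.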
R is more transparent
There are a lot of features that I feel make R a better choice, but perhaps one of the most prominent is its transparency. Everything you do during analysis in R, from deleting outliers to how you interpret results, is contained within the code. The code is presented in a linear fashion and allows for comments, which makes it even easier to interpret. This linear presentation also makes it quick to read, from top to bottom, like any other program.
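Here is a small sketch of what that looks like in practice, using outlier removal as the example. The salary figures are invented, and the 1.5 × IQR rule is just one common convention that I have picked for illustration.

```r
# Invented salary data; 530000 looks like a data-entry error.
salaries <- c(48000, 52000, 55000, 51000, 49500, 530000, 53000)

# Decision recorded in code: drop values more than 1.5 * IQR
# below the first quartile or above the third quartile.
q     <- quantile(salaries, c(0.25, 0.75))
fence <- 1.5 * IQR(salaries)
keep  <- salaries >= q[1] - fence & salaries <= q[2] + fence

cleaned <- salaries[keep]    # values used in the analysis
removed <- salaries[!keep]   # values excluded, kept for the record

print(removed)
print(mean(cleaned))
```

Anyone reading the script can see which rule was applied and which values were excluded, and the original data is left untouched.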
If you open up an Excel spreadsheet of some complexity, you will not easily be able to figure out what is going on or the thought process behind it. On top of that, values on a spreadsheet can be changed with no record of that change.
Excel spreadsheets can also be a minefield of hidden macros and formulae. The person who created it might know exactly what’s behind it, but a complex sheet can’t easily be shared with and interpreted by someone else.
R is free
Who doesn’t like free? While the Microsoft Office package containing Excel isn’t terribly expensive, it is less flexible than R, which is free to download.
R gives you the option of adding all sorts of features, whereas in Excel, you will have to wait for updates if the feature isn’t available. On top of that, R is supported by more platforms than Excel, so it has more universal application.
Final thoughts
HR professionals – if you’re still trying to use Excel to analyse and process large data sets, there is a better way. R gives you access to better capabilities and easier management of those large amounts of data.
Sure, you might still use Excel for smaller computations, but one thing I’ve found is that once you try R, you’re unlikely to go back. The large R community brings constant improvements relevant to HR professionals – there are even packages that allow for machine learning applications.
With the increased expectations that HR will contribute to decision-making by providing accurate insights, R gives you the opportunity to go deeper into the data.