You will all have read the excellent recent article by Patrick Coolen, which among many other interesting points stresses the importance of data validation, and the articles from iNostix about the HR Analytics Value pyramids. I would like to contribute to this subject with examples from my personal experience of what data validation and data cleaning mean in practical terms. I regard these activities as the foundations of the HR Analytics pyramids. Full compliance and discipline of staff are very important foundations of HR Analytics, too, but since these cannot be taken for granted, data validation and data cleaning are always required.
In my workplace several colleagues and managers run reports in SAP HR independently of each other and take actions based on the results, but what holds it all together is the not very exciting work of data validation and data cleaning. It only takes one unchecked item leading to a wrong decision for me to be asked immediately why I did not check that aspect as well. Frankly, I cannot possibly check everything that is going on in SAP HR. Data validation and data cleaning are therefore prioritised as follows:
1. Issues which have an immediate impact on users
My first priority is checking issues that have an immediate impact on users, who in my case are all the staff members. I give priority to these for pragmatic reasons: if users cannot do something, it is my team and I who have to answer the query and fix the problem. By doing these ex-ante checks I keep the issue log down.
All our users have to declare their working time and the specific activities they worked on in their employee self-service portal, called CATS (Cross-Application Time Sheet). Users apply for absences online, except for sick leave, which is entered from the backend because of the complicated rules that have to be respected. Absences are transferred automatically into CATS via a custom-built process. The built-in integrity checks of SAP HR rightly block certain entries when master data are not maintained properly (especially infotype 0315, i.e. time sheet defaults) or when users inadvertently contradict themselves (for example, an absence cannot at the same time be working time).
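The two situations above (missing infotype 0315 and a day booked both as absence and as working time) can be sketched as a small ex-ante check. This is only an illustration, assuming the relevant data have been exported from SAP HR into flat rows; the field names (`pernr`, `day`, `full_day`, `hours`) and the sample values are hypothetical and not taken from any SAP report.

```python
from datetime import date

def missing_time_sheet_defaults(active_staff, staff_with_0315):
    """Personnel numbers of active staff without an infotype 0315 record."""
    return sorted(set(active_staff) - set(staff_with_0315))

def absence_conflicts(absences, timesheet):
    """(personnel number, day) pairs booked as full-day absence AND as working time."""
    absent_days = {(a["pernr"], a["day"]) for a in absences if a["full_day"]}
    worked_days = {(t["pernr"], t["day"]) for t in timesheet if t["hours"] > 0}
    return sorted(absent_days & worked_days)

# Hypothetical export rows, for illustration only.
active_staff = ["1001", "1002", "1003"]
staff_with_0315 = ["1001", "1003"]
absences = [{"pernr": "1001", "day": date(2015, 3, 2), "full_day": True}]
timesheet = [
    {"pernr": "1001", "day": date(2015, 3, 2), "hours": 8.0},
    {"pernr": "1003", "day": date(2015, 3, 2), "hours": 8.0},
]

print(missing_time_sheet_defaults(active_staff, staff_with_0315))
print(absence_conflicts(absences, timesheet))
```

Running such a check before users open their time sheets lists the cases to fix in advance, which is the point of doing the checks ex ante.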
2. Data entry by backend users for training purposes
The second priority is checking SAP HR data entries made by backend users. I do this for training purposes: essentially, I check that data entry follows the established procedures, and sometimes I find that the procedure itself needs revising. Readers might think that SAP HR does this for you, but since I am talking about SAP areas that were configured ad hoc, some of them custom built, I have to check them myself almost manually, though good Excel skills make this manageable. SAP HR has so-called personnel actions, which are essentially guided runs through the various infotypes that backend users have to maintain; when hiring somebody, for example, several sets of HR data need to be entered. Backend users need to remain consistent until the personnel action is completed and execute it according to the user guide.
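A completeness check on a personnel action can be sketched as follows. It assumes an export listing which infotypes exist for each newly hired personnel number. The set of required infotypes is an assumption made for illustration; in practice it would come from the user guide for the action in question.

```python
# Assumed set of infotypes a hiring action must leave maintained (illustrative only).
REQUIRED_FOR_HIRING = {"0000", "0001", "0002", "0007", "0315"}

def incomplete_actions(records, required=REQUIRED_FOR_HIRING):
    """Map each personnel number to the required infotypes it is still missing.

    `records` is an iterable of (personnel number, infotype) pairs.
    Personnel numbers with nothing missing are left out of the result.
    """
    maintained = {}
    for pernr, infotype in records:
        maintained.setdefault(pernr, set()).add(infotype)
    return {
        pernr: sorted(required - infotypes)
        for pernr, infotypes in maintained.items()
        if required - infotypes
    }

# Hypothetical export: 2001 was hired completely, 2002 was left half-way.
records = [("2001", it) for it in ("0000", "0001", "0002", "0007", "0315")]
records += [("2002", it) for it in ("0000", "0001", "0002")]

print(incomplete_actions(records))
```

The result can be handed back to the backend user concerned, which serves the training purpose better than silently correcting the records.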
3. Preservation of reliability of HR services
The third priority is checking anything related to the services my HR department provides. I need to preserve the good image of HR. I check career statements and the quality of the HR data themselves, because administrative decisions such as step increases and promotions have to be based on correct HR data.
Once I have done all the checks above, I can dedicate myself to HR Analytics, which in my case means trying to predict the workload distribution among the various divisions and departments based on future absences and requests for part-time work, the working hours declared in the past, the staff allocated and the various business-related workload indicators.
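To make the inputs listed above concrete, here is a deliberately simple sketch of how they could be combined into a load ratio per division. It is an assumed, simplified calculation and not the method actually used. All field names, the weekly granularity and the figures are hypothetical.

```python
def available_hours(staff, planned_absence_hours):
    """Hours available per division: contracted hours scaled by the
    part-time rate, minus the absence hours already planned."""
    capacity = {}
    for person in staff:
        hours = person["weekly_hours"] * person["part_time_rate"]
        capacity[person["division"]] = capacity.get(person["division"], 0.0) + hours
    for division, absent in planned_absence_hours.items():
        capacity[division] = capacity.get(division, 0.0) - absent
    return capacity

def expected_load_ratio(capacity, past_hours_per_unit, forecast_units):
    """Hours expected to be needed divided by hours available, per division.

    `past_hours_per_unit` would be derived from working hours declared in the
    past; `forecast_units` is the business workload indicator for the period.
    """
    ratios = {}
    for division, hours in capacity.items():
        needed = past_hours_per_unit[division] * forecast_units[division]
        ratios[division] = needed / hours if hours > 0 else float("inf")
    return ratios

# Hypothetical figures for one week.
staff = [
    {"division": "A", "weekly_hours": 40, "part_time_rate": 1.0},
    {"division": "A", "weekly_hours": 40, "part_time_rate": 0.5},
]
capacity = available_hours(staff, {"A": 20})
print(expected_load_ratio(capacity, {"A": 2.0}, {"A": 30}))
```

A ratio above 1 would suggest that a division is expected to need more hours than it has available. Such a figure is only as reliable as the validated absence, part-time and time sheet data behind it.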