EDITOR’S NOTE: This article is about how to approach and think about leveraging data to make human resources more inclusive across organizations. True Interaction built SYNAPTIK, our Data Management, Analytics, and Data Science Simulation Platform, specifically to make it easy to aggregate siloed data (e.g. from Human Resources, Finance, etc.) for more meaningful data discovery.
We know that Machine Learning (ML) and Artificial Intelligence (AI) will transform the future of work itself, but how will they affect the processes by which organizations choose, develop and retain their workers?
Katherine Ullman is a data scientist at Paradigm, a strategy consulting firm using social science to make companies more inclusive. Paradigm has partnered with a range of clients, including Airbnb, Pinterest, Asana, and Slack, among others. Katherine and I had a recent discussion about how her organization works with data and machine learning to assist clients in better understanding the impact of their people processes including recruitment, selection, training and retention of underrepresented groups.
(NOTE: The word “impute” below is a technical term that refers to the assigning of a value by inference.)
TI: Being a data scientist at a human resources & strategy consultancy that leverages social science to make companies more inclusive sounds like an amazing job. Can you provide some insight into your daily work and the work you do with clients?
“One of the core services I work on as a data scientist is our comprehensive Diversity & Inclusion Assessment. (The assessment) is a multi-month project designed to identify barriers and design client-specific strategies for diversity and inclusion. We do that by collecting and analyzing both quantitative and qualitative data about the client’s people processes and the outcomes of those processes.”
“The first thing I am doing is understanding, cleaning, and linking that data…that’s a surprisingly large part of the process. Once we clean it, we analyze the data to understand how an organization attracts, selects, develops and retains its workforce with a particular focus on underrepresented groups. What we’re looking for is how important, people-related outcomes in an organization – things like who is hired, who is promoted, and who stays or leaves – might vary depending on your identity. At the same time, our consultants are doing the qualitative research, and then we synthesize those findings internally.”
“(Then) our learnings from (the data) shape the strategic recommendations that we offer to clients to, again, improve how they attract, select, develop and retain all employees, and particularly those from underrepresented groups. Clients often come to us with a sense that they have room to improve with respect to diversity and inclusion, but don’t know where to focus their attention or what to do once they determine that focus. Our analyses not only provide insight into where our clients should concentrate their efforts but also provide clarity around what solutions to implement. For example, a client might believe they have a problem attracting a diverse set of applicants, but we find that their applicant pool is relatively diverse and underrepresented people are simply falling out of the funnel early in the hiring process. We might have that client concentrate less on active sourcing, then, and instead focus on ensuring their early stage selection processes are fair.”
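(NOTE: The funnel comparison described above can be illustrated with a short sketch. This is not Paradigm's code or method; the stage names, the group labels "A" and "B", the field names, and all counts are hypothetical, chosen only to show how stage-to-stage pass-through rates might be compared across groups.)

```python
from collections import defaultdict

# Hypothetical hiring stages, in order. Real pipelines will differ.
STAGES = ["applied", "phone_screen", "onsite", "offer"]


def pass_through_rates(applicants):
    """For each group, compute the share of candidates at each stage
    who advance to the next stage."""
    reached = defaultdict(lambda: [0] * len(STAGES))
    for a in applicants:
        last = STAGES.index(a["furthest_stage"])
        # A candidate who got as far as stage `last` also passed every earlier stage.
        for i in range(last + 1):
            reached[a["group"]][i] += 1

    rates = {}
    for group, counts in reached.items():
        rates[group] = {}
        for i in range(len(STAGES) - 1):
            label = f"{STAGES[i]} -> {STAGES[i + 1]}"
            rates[group][label] = counts[i + 1] / counts[i] if counts[i] else None
    return rates


def make(group, stage, n):
    """Build n made-up applicant records for one group and furthest stage."""
    return [{"group": group, "furthest_stage": stage} for _ in range(n)]


# Made-up data: both groups have 10 applicants, so the applicant pool is balanced.
applicants = (
    make("A", "applied", 5) + make("A", "phone_screen", 3)
    + make("A", "onsite", 1) + make("A", "offer", 1)
    + make("B", "applied", 8) + make("B", "phone_screen", 1)
    + make("B", "offer", 1)
)

rates = pass_through_rates(applicants)
for group in sorted(rates):
    print(group, rates[group])
```

In this made-up data the two groups apply in equal numbers, but group B advances past the first stage at a much lower rate, which is the kind of pattern that would point toward early-stage selection rather than sourcing.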
TI: How do you wrangle all the diverse, siloed data and organize it for your internal analysis purposes?
“Storage is not usually a difficult issue in terms of size, but there are obviously security concerns that we take incredibly seriously, as we deal with sensitive data.”
“R is our main wrangling tool and we use some of the really great open source packages developed by the R community that allow us to create interactive dashboards and crisp visualizations as a means of communicating back insights, both internally to other members of our team and to our clients as well.”
TI: How does machine learning impact your current work? How are you using it?
“We use some ML techniques to impute missing demographic data. It’s often in the applicant data from recruiting software systems where we have the most missing data in terms of demographics. Once we impute the missing data, we are able to ask, for example, ‘Here are people from different demographic groups, how are they entering the application pipeline? Are they coming in through referrals? Through the company’s job website? Or through third party boards or events?’ A lot of the companies we work with actually track this information at a granular level, making it easy to gain insight about who is entering the funnel through what sources, and how successful those sources are.”
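(NOTE: The interview does not say which ML techniques are used for imputation. Purely as an illustration of the general idea, the sketch below fills in a missing categorical field by majority vote among the k most similar complete records, a simple k-nearest-neighbors approach. The field names "x1", "x2", and "group" and all values are hypothetical, and any real use on demographic data would need careful validation.)

```python
import math
from collections import Counter


def knn_impute(records, target, features, k=3):
    """Fill missing `target` values by majority vote among the k nearest
    records (Euclidean distance over `features`) that have the value.
    Each returned record carries an `imputed` flag so that inferred
    values stay distinguishable from recorded ones."""
    labeled = [r for r in records if r[target] is not None]
    completed = []
    for r in records:
        new = dict(r)
        if r[target] is None:
            point = [r[f] for f in features]
            nearest = sorted(
                labeled,
                key=lambda o: math.dist(point, [o[f] for f in features]),
            )[:k]
            votes = Counter(o[target] for o in nearest)
            new[target] = votes.most_common(1)[0][0]
            new["imputed"] = True
        else:
            new["imputed"] = False
        completed.append(new)
    return completed


# Made-up records: x1 and x2 stand in for whatever numeric features are available.
records = [
    {"id": 1, "x1": 0.10, "x2": 0.20, "group": "A"},
    {"id": 2, "x1": 0.20, "x2": 0.10, "group": "A"},
    {"id": 3, "x1": 0.90, "x2": 0.80, "group": "B"},
    {"id": 4, "x1": 0.80, "x2": 0.90, "group": "B"},
    {"id": 5, "x1": 0.15, "x2": 0.15, "group": None},
    {"id": 6, "x1": 0.85, "x2": 0.85, "group": None},
]

completed = knn_impute(records, target="group", features=["x1", "x2"], k=3)
for r in completed:
    print(r["id"], r["group"], "(imputed)" if r["imputed"] else "")
```

Keeping the `imputed` flag lets later analyses be run both with and without the inferred values, which is one way to check how much a conclusion depends on the imputation.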
TI: How do you see machine learning impacting your work five years from now?
“(Currently, Paradigm) is using existing tools primarily for imputation, and we aren’t pushing the envelope too far. At this stage, I think this is wise. Our work with clients has real outcomes on people and you need to really know, and take seriously, the implications of what you are doing when you are using these new and exciting tools.”
“I think we are going to continue to see a lot of AI and machine learning move into the HR space. There are examples of companies who are already using this well, like Textio, but I think it’s important to be both optimistic and suspicious about new technologies in this space. Do we want machine learning to make hiring decisions? One might argue that is going to remove bias from the process because it removes the need for human judgement, but at the same time you have to wonder, what is the data underlying these models? It is very difficult to find data that links characteristics of people and their employment outcomes that is free from human bias, so any machine learning that is built on that data is likely to replicate those issues.”
“But there are reasons to be optimistic about the future of machine learning. For example, I am seeing a lot of work to actually diversify the machine learning industry. The advocacy there is really important because people are starting to understand that who makes these tools matters a lot.”
TI: What recommendations do you have for organizations who want to use data to understand and improve their current HR practices?
“A lot of (companies) already have recruiting software – applicant tracking systems – because even at small companies recruiting and hiring is such a heavy lift that most people find themselves looking into systems that will help them make that easier.”
“I think where companies can improve is really honoring the recording of data that isn’t auto-populated through every ATS (applicant tracking system). For example, in applicant data, taking the time to record every applicant that comes through a referral and who that referral is. This is really important to understand the success of hiring sources, especially for some of the larger companies we’ve worked with. Even if less than 3% of the applicant pool are referrals, we may end up finding that referrals comprise over 30% of hires. When companies really make sure that everyone getting referred is documented, we can feel confident and clear in our insights. Most companies are collecting this data in some way, but there’s a lot of variance in the quality of data that individuals need to record themselves.”
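(NOTE: The referral comparison above comes down to two shares per hiring source: its share of applicants and its share of hires. The sketch below shows that arithmetic on made-up data. The source names, field names, and counts are hypothetical; they were chosen to mirror the pattern described, not taken from any client.)

```python
from collections import Counter


def source_shares(applicants):
    """Return each source's share of all applicants and share of all hires."""
    applied = Counter(a["source"] for a in applicants)
    hired = Counter(a["source"] for a in applicants if a["hired"])
    total_applied = sum(applied.values())
    total_hired = sum(hired.values())
    return {
        source: {
            "applicant_share": applied[source] / total_applied,
            "hire_share": hired[source] / total_hired if total_hired else 0.0,
        }
        for source in applied
    }


def make(source, hired, n):
    """Build n made-up applicant records for one source and outcome."""
    return [{"source": source, "hired": hired} for _ in range(n)]


# Made-up data: 200 applicants, 4 of them referrals; 9 hires, 3 of them referrals.
applicants = (
    make("referral", True, 3) + make("referral", False, 1)
    + make("job_board", True, 6) + make("job_board", False, 190)
)

shares = source_shares(applicants)
for source, s in shares.items():
    print(f"{source}: {s['applicant_share']:.1%} of applicants, {s['hire_share']:.1%} of hires")
```

The result is only as reliable as the `source` field: a referral that was never recorded would be counted under another source and would understate the referral share in both columns.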
“I’m really looking forward to the development of more HRIS/ATS systems in this space that will streamline data collection and link various systems (performance, recruiting, internal surveys, etc). Until then, I think the (best) thing to do is to really honor the data collection process with an eye towards making it legible for other people internally or externally to use in the future. This will happen naturally as people analyst positions become more of a norm, but until then, I think people think of the data collection process as just a burden with no end goal. I get that, but if done well, it really gives us (and our clients) the opportunity to gain meaningful and accurate insights.”