Creating New Data Scientists in the Age of Remote Work
Learning how to be a professional data scientist is different now, but it’s not impossible
Today’s column is partly about data science, but it’s also about the sociology of work. As a senior practitioner in the field, I started my data science career long before Covid-19 and the radical shift in how we work that followed, and I started my professional career years before even that. As a result, my years of learning how to be a professional anything, let alone a professional data scientist, were spent in close quarters with much more experienced people, and in many ways that made it possible for me to get where I am. My development into a data scientist was not just about coursework and studying; it was about learning how to be a data scientist in all kinds of ways, many of which I absorbed subtly by being around other data scientists and doing the job.
It’s important to recognize that work isn’t only about what we give to our employers, even under capitalism. It’s also about what we get back, and not only in monetary terms. Workplaces and jobs have many social and cultural impacts on us, beyond just collecting a paycheck. In particular, we develop our social identities through professions, and we learn how to embody those identities by seeing how other people enact them.
The point I want to make is that turning smart but inexperienced young people into professional data scientists isn’t that much about math skills, but more about social norms, developing networks, and getting acclimated to the context of our work. A lot of these elements are tough to acquire at the best of times, and we are now in a situation where remote and hybrid work are going to require us to come up with new ways for this information and culture transmission to work.
(A lot of what I’m going to talk about can be relevant to many kinds of work, but I’m specifically interested in the experiences of young data scientists here.)
Let’s dig into some of the things we acquire from work that enable us to become full participants in our profession.
Norms
We develop cultural and social norms from working, by observing the behavior of others, particularly others above us in the hierarchy. Business jargon, norms for attire, and social niceties, for example, can all develop through mostly indirect learning in the workplace. In some workplaces, especially white collar ones, these norms represent tools for building social capital and even class mobility. In data science, there are some norms that are generalizable across the field, or within different industries where data scientists practice. For example, in tech data science, casual dress is certainly the norm. There are also unspoken standards for how you interact with your boss and other people in leadership, including how you communicate about technical topics. There are tons of norms around how to just be a professional in general, too: how to handle business travel, how to interact with customers, and so on. These skills are all essential for professional success sooner or later, but in my experience they’re things you learn mostly by observation and osmosis.
Skills
We also gain tangible skills from the workplace, as nearly everyone has to learn something to be successful in a new job. In my case, coming into data science after some years in academia, I had a lot to learn about how to take what I knew about data science and machine learning and apply it to business problems, rather than academic ones. I learned new algorithms, coding best practices, and many other skills from colleagues in data science jobs I’ve held, particularly my first one. This wasn’t all formal training by any means. A significant amount of it was passive learning by watching and absorbing how other, more experienced and successful folks did things. Relatedly, we learn about our “unknown unknowns”. We all have blind spots, especially when we’re starting out, and we don’t realize we’re missing a piece of the puzzle or an approach that might be useful until that absence is pointed out to us. Observing colleagues who use a skill you didn’t know existed can open the door for you to build that skill yourself.
Networks
Beyond this, there’s a more abstract but still important element of camaraderie and network creation that workplaces provide. Ideally, when you join a team or a business, you develop relationships with the other people involved, and those connections are the glue of a professional network that can help your career develop in the future. If you don’t make those linkages, you will be disadvantaged not just in your current job, but down the road. I for one am incredibly fortunate to have a strong professional network, developed through collegiality, that has been instrumental in my career success. Data scientists are great people, and we help each other find opportunities and make connections, but you need a way to break into those networks when joining the profession. Often this is made easier by having more experienced colleagues who will make introductions and informally vouch for you.
The process of creating a new data scientist successfully involves acquiring all of these components in some way (and probably others besides, depending on the situation). But, as I mentioned, the physical context of work has changed a lot (and for the better, I think) since I started my career. How do we bring new practitioners into the field in this new world?
Where We Work
The way I see it, there are really four ways of working for white collar folks like those of us in data science/machine learning today.
- 100% in office in person, with no one working remotely at all
- Hybrid: Some or mostly remote work with intentional, purposeful facetime
- Hybrid: Some or mostly remote work with sloppy, haphazard facetime
- Fully remote work with zero facetime
As many pundits have argued, something in the hybrid space is probably what most of us data scientists are going to have in coming months and years, if we haven’t already. Fully in-office work is not coming back for most of us, because we have gotten a taste for the autonomy and flexibility of remote work, and realized how much of an improvement this is for our quality of life. Data scientists have a skill set in high enough demand that we can get this flexibility in roles if we want it.
It’s important to spend time disambiguating what “hybrid” means. Commuting into some office park or downtown three days a week is not the only way to do work where people sometimes see each other in person and other times don’t, and it’s frustrating to see how tone-deaf the conversation is on this. I’d regard most conceptions of hybrid work as “Some or mostly remote work with sloppy, haphazard facetime”. This is because employers are trying to create hybrid in the mold of pre-Covid workplaces. It’s a function of having very little understanding of what we want work to be, and what the worthwhile tradeoffs are.
Spending time in a mostly empty office with a handful of people you barely know and scarcely interact with is a poor way for a junior data scientist to achieve the benefits I described above, if it achieves them at all. And the tradeoffs to them, their families, and the community are tremendous. Commuting is terrible for our individual health, our social welfare, and our environmental health, to say nothing of the valuable time it wastes that we could be spending in productive ways. If we are going to spend time traveling to a place of work, it had better be worth it.
What’s the alternative? I’m a big fan of “Some or mostly remote work with intentional, purposeful facetime”. An example of this might be day-to-day remote work with a quarterly on-site, where people travel to a central place (not just a regional or local office) to do strategic planning, work collaboratively, spend some time socially, and learn from each other. There are myriad other possibilities for how you could split this time, but the point is that it is face time with a purpose, and it’s designed to fulfill that purpose.
Things that may be true of effective hybrid work:
- it may not be cheaper than remote work OR fully in office
- it will require thought and planning to be successful
For experienced data scientists who have been in the field for years, fully remote work might be fine. We have broken into the networks, learned the social norms, and acquired skills (and, most importantly, channels for updating our skills) that don’t require special face time. I’d argue, though, that we do have a responsibility to lend a hand to those coming up after us, and that spending some purposeful time in person with junior peers is a worthwhile way of giving back.
How to Do It
I’m not going to prescribe exactly how intentional face time should be structured, because no two companies or organizations are the same, and one size doesn’t fit all. However, I have a few suggestions for the specific targets I’ve described above.
- Norms: Transmitting norms and culture is best done intentionally. Don’t just hope that your junior staff will immediately understand how interpersonal expectations work. We might not have had to spell these things out when everyone was spending 40+ hours a week side by side, but that may have changed. Be more explicit than you think you need to be. Some of the passive absorption of these norms will happen during intentional face time too.
- Skills: The key skills of data science are described in different ways depending on who you ask, but usually involve some mix of coding, statistics and machine learning, and business acumen and communication. These are all things that we develop and improve by doing the job, but we also gain these skills by observing how other people do the job. By creating collaborative working opportunities in your intentional face time, and not just focusing on tedious meetings, you can help this skill transmission happen.
- Networks: I honestly think building networks can be the hardest thing to do with remote data scientists, because we so frequently do our day-to-day work alone. Even though you’ll have model and code reviews as a team, and may connect at stand-ups, meetings, and hackathons, a lot of the network development in pre-Covid workplaces came from peripheral socialization. Chats over the coffee machine are a cliché, but they actually do create opportunities for colleagues to get more familiar with each other. That’s why sloppy, haphazard face time is entirely different from purposeful face time: having interactions for social purposes during these designated on-sites can go a long way toward developing strong networks.
It seems like many employers are unsure how to bring on entry-level data scientists and develop them into experienced practitioners in this new work world, so they default to trying to hire more senior people than their work really calls for. While this makes for more opportunities and demand for folks at my level, it’s not good for the field as a whole. We need to have new entrants to the discipline, bringing ideas and creativity, and we need to give them the tools to grow and succeed, even though we’re not in the same office all day the way we once were.
Our task as established members of the data science profession is first to acknowledge that things are different now, and that’s okay. We can’t wish our way back to a different work world, and I wouldn’t want to. Flexibility in work makes our lives and our communities better. We just need to put forth the effort to identify what’s important, and figure out how we can achieve these goals in this new environment.
See more of my work at www.stephaniekirmer.com.
Creating New Data Scientists in the Age of Remote Work was originally published in Towards Data Science on Medium.