The real value of a degree for aspiring data scientists

Russell Pollari

And how to get ahead without one

"Do I need a graduate degree to get a job as a data scientist?"

I get asked some version of this question on a regular basis.

It's a fair assumption, given how often M.Sc. or Ph.D. degrees are listed in the requirements of job postings.

As a Ph.D. dropout, I want to answer with an emphatic no. I had a less than ideal experience as a grad student. And it did not prepare me at all for the job market.

But the full answer is a bit more complicated.

I don't think a graduate degree makes you better prepared for the job market—at least, not enough to justify the cost in both time and money. Yes, you will learn a lot. But you can learn just as much—if not more—from free online courses (having a good mentor helps too).

However, having a degree (the higher the better) gives you a leg up in the job market. And that's because education is valuable to employers as a signal. Bryan Caplan makes this argument in The Case Against Education:

The earnings premium for college grads has rocketed to over 70%.… How could such a lucrative investment be wasteful? The answer is a single word… signaling. Even if what a student learned in school is utterly useless, employers will happily pay extra if their scholastic achievement provides information about their productivity.

Caplan's argument is that higher education doesn’t produce better workers. It just reveals preexisting traits that the job market values—intelligence, conscientiousness, and conformity.

It's the combination of those traits that is valuable to employers. It's very hard to signal all three without a degree. You can test a candidate's intelligence with a technical interview, for example, but it is much harder to test their level of conscientiousness and conformity. From Bryan Caplan again:

You can’t discover a person’s true work ethic with a glance. You certainly can’t ask, “How good is your work ethic?” and expect candor… A signal doesn’t have to be definitive, just better than nothing.

So, do you need a graduate degree to get a job in data science?

Well, no. You don't need it. But it helps.

It won't necessarily make you a better employee, but it will be a useful signal to get your foot in the door.

What to do instead of a degree

There is a better signal than a degree—actual work experience.

If you’ve already worked elsewhere in a similar role, it’s a strong indicator that you know what you are doing. Mentees at SharpestMinds regularly report that the job search becomes significantly easier when they are looking for their second role. It’s the first one that’s the hurdle.

Technology, however, offers a workaround. For certain types of skills, you can gain experience without getting hired. I made this argument in a blog post for Towards Data Science:

For many professions, you need to land a job to start accumulating [work experience]. But there is a wonderful difference when it comes to data science and machine learning—you can get plenty of experience before getting hired. How? By building things.

Proof-of-work, via a portfolio of projects, can help signal your competence as a data scientist. But remember, you also want to signal conformity and conscientiousness. You want to show that you can follow rules and conventions. And that you have a desire to do things well—with care and diligence. So don't just throw a bunch of Jupyter notebooks and capstone projects from MOOCs and bootcamps up on GitHub and call it a day. Put some care into the things you build.

You can signal conscientiousness by paying attention to the details of your projects. Notebooks are good for exploration and prototyping, but not for iterating on a code base or collaborating with others. Go the extra mile and write clean, modular code. Organise your files into a reasonable folder structure.
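As a rough sketch of what "clean, modular code" might look like once it leaves the notebook: a small function with type hints and a docstring, living in its own module, with a test next to it. The module path and function name here are made up for illustration, not taken from any particular project.

```python
"""Hypothetical module, e.g. src/features.py in your project's folder structure."""
from typing import Iterable, List


def normalise_column_names(columns: Iterable[str]) -> List[str]:
    """Return lower-case, underscore-separated versions of the given column names."""
    # Strip surrounding whitespace, lower-case, then replace inner spaces.
    return [c.strip().lower().replace(" ", "_") for c in columns]


def test_normalise_column_names() -> None:
    # A tiny test like this would normally live in a separate tests/ folder.
    result = normalise_column_names([" Order ID", "Unit Price "])
    assert result == ["order_id", "unit_price"]
```

A function like this can be imported from a notebook, reused across scripts, and reviewed in a pull request, which is much harder to do with logic buried in a notebook cell.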

Have a nice README on your projects—include instructions on running the code and contributing to the code base. You'll get major bonus points from me if I can clone your repo and get it running on my machine without much headache.

Look out for hard-coded paths and keys. Make the environment easy to reproduce. A requirements.txt is good enough for most hobby projects, and it still signals that you put some thought into environment reproducibility.
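One common way to avoid hard-coded paths and keys is to build paths relative to the project and read secrets from environment variables. The snippet below is a minimal sketch of that idea. The folder name `data` and the variable name `MY_SERVICE_API_KEY` are placeholders I picked for the example, so swap in whatever your project uses.

```python
import os
from pathlib import Path

# Resolve paths relative to the project instead of one machine's file system.
# Falls back to the current working directory when __file__ is not defined
# (for example, inside a notebook).
PROJECT_ROOT = Path(__file__).resolve().parent if "__file__" in globals() else Path.cwd()
DATA_DIR = PROJECT_ROOT / "data"  # hypothetical folder name


def data_path(filename: str) -> Path:
    """Build a path to a data file without hard-coding an absolute location."""
    return DATA_DIR / filename


def get_api_key(name: str = "MY_SERVICE_API_KEY") -> str:
    """Read a secret from the environment rather than committing it to the repo."""
    key = os.environ.get(name)  # the variable name is a placeholder
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before running.")
    return key
```

Someone cloning the repo then only needs to set one environment variable, which the README can document, instead of hunting through the code for your home directory or a leaked key.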

A lot of the above can also signal conformity. You should show that you employ best practices and understand how to use common tools and frameworks. Look at some popular open-source libraries and mimic their project structure and documentation. Make your code conform to the PEP8 style guide. Write good commit messages.

A completed degree is useful as a signal because it takes multiple years to complete. It shows potential employers that you can start a long-term project and see it through. This is why a portfolio with a single, well-maintained project is much more valuable than one with dozens of abandoned projects and prototypes.

Build something end-to-end and put continuous work into it. Treat it like a product. You want to signal to employers that you can set your sights on a problem, one that mimics an actual business problem, and see it through over a long period. Build a nice-looking front-end. Share it with the world. Make regular contributions to show you have not abandoned it.

Go beyond the "Initial commit".
