Rich Data     About/Imprint     CV     Publications     Blog Archive     Blog Feed

Starting off in data science

200 lines of code (R)

A little more than a year ago, I decided to pursue a career in data science. Today, I work as an educational data scientist for StackFuel, a small start-up in Berlin. How did I do it?

alt text

Timeline to data science: a year in the life of Richard Kunert

In September 2016 I left the Netherlands where I had pursued a PhD in cognitive neuroscience in order to live in Berlin. I wasn’t too sure why Berlin and what I would do there. But I figured that if there was one city where directionless, unemployed, highly skilled people come together, it’s Berlin. After a few months of exploring the city’s various boulder gyms and nightlife venues, I decided that my new direction in life would be data science.

So, I asked friends with a similar background who went into data science what I should learn. This is what I still have in my mind from their advice:

  • python
  • SQL
  • machine learning

So, by March 2017 I had finished the free codecademy course on python. I finished the three SQL courses by codecademy somewhat later in October. I picked up machine learning along the way and now follow a free HPIopen course on the topic.

In June 2017 I felt ready to hit the job market. What followed is best summarised by this figure: alt text Notice the increasing steepness of the curve. It reflects the increasing frequency with which I applied for jobs. In the beginning, I applied for two jobs per month. By the end, I applied for two jobs per day.

Application processes took on average 18 days, but the variability was huge (range 3 to 46 days). By November 2017, one particular 22 day application process at StackFuel resulted in my first job offer. At that point I still had six applications running, including three at the interview stage, but I withdrew from all of them. By December my transition to data science was complete.

Facilitating the transition to data science: blogging and freelancing

During my job hunt I was very surprised to find out that recruiters in Germany have the following profile in mind when they think of a typical data scientist:

  • studied statistics/economics
  • worked in marketing
  • knows the ins and outs of machine learning
  • does not necessarily know how to program

Notice how the ‘science’ part in ‘data science’ is irrelevant. I decided to completely abandon relying on my scientific credentials to get a job. They only play a minor role in the CVs I sent out. Everything I did to score the next job in academia was more or less in vain for scoring a job in industry. I won’t deny that that was a tough pill to swallow. But I did.

Instead, I figured that I needed to convince recruiters of my skills without referring to my academic work. I did this in two ways. First, I started this very data science blog as a portfolio of what I can do. At every job interview I attended I got the impression that it received at least a cursory glance by the interviewer.

Second, I went freelance. Through some contacts I managed to convince the Humboldt University to give me a small text mining and classification assignment, as a paid freelance job. I worked on this from October 2017 onwards. It was instrumental in convincing employers that I am a data scientist, rather than becoming one through their company. I really liked working as a freelance data scientist and I want to pursue this route while being an employee, but first and foremost, I wanted a job.

Dealing with rejection

Only 4% (1/25) of my data science applications resulted in a job offer. Compare this to the 1/3 rate I had when applying for a post doc in academia. (I turned down that one offer.)

All application processes consisted of two interview stages, on average 17.5 days apart (range 10 to 26 days). Often I got a technical or business task in between. Presenting it would form the basis for the second interview. How good was I to reach each application stage? This figure shows my data science application funnel, i.e. a markov chain with transition probabilities: alt text

The ideal path is the central axis. You move from start to two job interviews, possibly with a task in between, and on to the job offer. The dots at each stage become smaller, representing the decreasing number of applications which went so far. The arrows connecting dots are sized according to the probability of moving from one application stage to the other.

Notice that scoring a job interview at all is the biggest hurdle. Of the 28% (7/25) of applications which advanced to the first interview, most also advanced to the second interview stage. Half of the applications at the second interview stage got me a job interview or I withdrew from the application.

Again: scoring the first job interview is the biggest bottleneck to working as a data scientist. The entire application material you send to your future employer should be 100% designed to getting that interview.

Top tips for securing a data science job

  1. take free online courses to improve your practical data science skills
  2. forget about your uni achievements
  3. show off your practical data science skills on your blog
  4. work as a freelance data scientist
  5. optimise your application material for securing a job interview
  6. don’t be too choosy with your first job

Welcoming a new generation of data scientists

What I describe is just my own case. However, my ex-colleague Gwilym had a similar experience in London. You can read his story on his excellent data science blog here. So, who knows, my case might generalise to others.

My new job actually involves training new data scientists. StackFuel is an educational technology start-up involved in further education. I hope that this blog post can convince others that a career switch to data science is feasible. After all, even I managed it.

The figures were made in R. The script is here.

Like this post? Share it with your followers or follow me on Twitter!