Rich Data

Using Quantum Computing To Optimize Coin Mining

One of the biggest conundrums in coin mining is the efficacy of hash rate relative to electricity costs. Our argument is that, in order to use a cost-effective miner, one would need to create a “static-free” mining environment around SHA-256, a cryptographic hash function. SHA-256 is the name of a technology …

Transforming from one scale to another

Leave a Comment / Info / By admin

Converting data from one scale to another is a common task for data scientists. This blog post goes beyond sklearn’s capabilities. I have always lacked one specific scaler, so in this post I am writing it myself: ScoreScaler. Consolidating data despite different scales It is very common to compare scores or product ratings. …
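
To make the idea of moving a score between scales concrete, here is a minimal sketch of a plain linear rescaling. The excerpt does not show how ScoreScaler works, so the function `rescale_score` and its parameters are hypothetical and should not be read as the post's implementation.

```python
def rescale_score(value, src_min, src_max, dst_min, dst_max):
    """Linearly map `value` from [src_min, src_max] to [dst_min, dst_max].

    Hypothetical helper for illustration; not the ScoreScaler from the post.
    """
    if src_max == src_min:
        raise ValueError("source scale must have non-zero width")
    # Position of the value within the source scale, as a fraction in [0, 1].
    fraction = (value - src_min) / (src_max - src_min)
    # Place that fraction on the destination scale.
    return dst_min + fraction * (dst_max - dst_min)


# A 4-star rating on a 1-5 scale, expressed on a 0-100 scale.
print(rescale_score(4, 1, 5, 0, 100))  # 75.0
```

A linear map like this keeps the relative distances between scores, which is usually what you want when consolidating ratings that were collected on different scales.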

How close are German political parties to each other? Using R to derive the latent semantic network of German election manifestos

On September 24, the Germans will elect a new federal parliament. In this post, I text-mine the campaign manifestos of the main parties, extract the latent semantic space, and visualize it to see who is closer to whom in German politics. How close are Germany’s election manifestos in meaning? If you want to know what …

Using Python, the IMDb API, and web-scraping Rotten Tomatoes to find the best Star Trek movie

Star Trek is a very rare positive vision of the future. Which film best wins over audiences with this encouraging message? I learned Python to collect every movie rating I could find to answer this question. If you follow this guide, so can you. Which Star Trek movie is the best? The short answer …

The surprisingly good performance of dumb classification algorithms

When evaluating binary classification algorithms, it is recommended to have a baseline for performance metrics. In this blog post, I calculate the classification performance of really dumb classifiers. These models do not use any feature information. If your own classification model performs no better than they do, you have a problem. Dumb classifiers When assessing how …
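
As an illustration of a classifier that ignores all features, here is a minimal sketch of one such baseline: always predicting the most frequent label. The function name `majority_baseline_accuracy` and the example data are assumptions made for this sketch; the post may evaluate other dumb classifiers and other metrics.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label.

    No feature information is used, only the label distribution.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    # Count of the single most common label.
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Made-up imbalanced data set: 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10
print(majority_baseline_accuracy(labels))  # 0.9
```

On imbalanced data this baseline already looks strong on accuracy, which is why a real model should be compared against it before its score is taken at face value.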

The tricky question of how long it takes for Corona cases to double

The doubling time of Covid-19 cases has become one of the key indicators of the corona pandemic. Political decision makers use this number to decide when isolation measures should be relaxed. In this post, I show that different assumptions about the virus outbreak lead to different doubling time estimates. Which number should you trust? Covid-19 …
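
To show how a doubling time follows from an assumption about growth, here is a minimal sketch under the assumption of constant exponential growth. The function names and the example numbers are hypothetical and are not taken from the post, which may compare different assumptions.

```python
import math


def doubling_time_from_growth_rate(daily_growth_rate):
    """Days needed to double at a constant daily growth rate (0.10 = 10% per day)."""
    if daily_growth_rate <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / math.log(1 + daily_growth_rate)


def doubling_time_from_counts(cases_start, cases_end, days):
    """Doubling time implied by two cumulative counts taken `days` apart.

    Assumes growth between the two observations was exponential.
    """
    if cases_start <= 0 or cases_end <= cases_start or days <= 0:
        raise ValueError("need growing positive counts over a positive time span")
    return days * math.log(2) / math.log(cases_end / cases_start)


# Made-up numbers for illustration only.
print(round(doubling_time_from_growth_rate(0.10), 2))  # 7.27
print(round(doubling_time_from_counts(100, 400, 6), 2))  # 3.0
```

Both functions rest on the same exponential model. If the outbreak is not growing exponentially, the choice of observation window changes the estimate, which is one way different assumptions produce different doubling times.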

An analysis of the rental bike market in Berlin

2018 was a wild summer for the bike rental market in Berlin. In the beginning, many new bicycle systems were introduced to the market. By now, two have already left. In the meantime, I’ve counted every rental bike I saw. Which bikes are rented the most? Berlin bike rental market in 2018 On May 7, …

Starting off in data science

A little over a year ago, I decided to pursue a career in data science. Today I work as an Educational Data Scientist at StackFuel, a small startup in Berlin. How did I do it? Timeline of Evolution of Data Science: A Year in the Life In September 2016, I left the Netherlands, where I …

Predicting typical completion rates of online courses

Massive Open Online Courses (MOOCs) have not revolutionized education. Why? They suffer from terrible completion rates. Most students who start a MOOC never complete it. In this blog post, I’ll take a look at what my own company’s eLearning completion rates would be if we offered standard MOOCs. How many people go through MOOCs? When I …

Exponentially scaling your data in order to zoom in on small differences

Machine learning models benefit from stretching the region of the scale where most data points differ. In this blog post, I present an exponential scaler that does just that. It stretches the lower or upper end of the scale to focus the machine learning model on the differences that matter most. Design a …
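
One way such a scaler could work is sketched below. The excerpt does not show the post's design, so the function `exponential_scale`, its parameter `k`, and the specific formula are assumptions for illustration only.

```python
import math


def exponential_scale(values, k=3.0):
    """Map values to [0, 1] with an exponential curve.

    Hypothetical sketch: k > 0 stretches differences near the upper end,
    k < 0 stretches differences near the lower end, and k == 0 falls back
    to plain min-max scaling.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be equal")
    # Plain min-max normalization to [0, 1] first.
    normalized = [(v - lo) / (hi - lo) for v in values]
    if k == 0:
        return normalized
    # (exp(k * x) - 1) / (exp(k) - 1) keeps 0 at 0 and 1 at 1.
    denom = math.expm1(k)
    return [math.expm1(k * x) / denom for x in normalized]


scaled = exponential_scale([0, 5, 10], k=3.0)
# The midpoint lands below 0.5, so the gap to the top value is widened.
print(scaled[1] < 0.5)  # True
```

The endpoints stay fixed at 0 and 1, so only the spacing in between changes. The sign of `k` decides which end of the scale gets the extra resolution.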
