
Statistics student writing about Data Science. Reach out to me @ linktr.ee/benthecoder

Data Science | Machine Learning | Artificial Intelligence

Here is a list of the best free online courses on Data Science, Machine Learning, Deep Learning, and Artificial Intelligence.

Photo by Tianyi Ma on Unsplash

Formal education in the 21st century has become a choice rather than a mandatory step in life. With the internet boom and the rise of Massive Open Online Courses (MOOCs), one can opt to learn data science online and avoid the burden of student debt. Some statistics suggest that eLearning enables students to learn 5x more material for every hour of training. The benefits of online learning are many, from lower costs to a flexible schedule and environment.

The democratization of Data Science

It’s the year 2020, and data science is more democratized than ever. This means any individual can do data science…


Deep Learning

An easy step-by-step guide to building a convolutional neural network with PyTorch.

Credit: https://unsplash.com/@dorner

Look at the image above. What do you see? You might think this is a trivial question — clearly, it’s a yellow banana casting a shadow on a pink background. But how might computers recognize this image without having eyes and brains like humans do?

Computers aren’t able to reason about the world intuitively like we do, so for computers to “see” and recognize objects, scientists have to crack the complex system that is the human brain and implement it on a computer. As children, we have an innate curiosity to explore and experiment with the world. …


Machine Learning

Exploring the most popular data science methods and their applications.

There are so many new data science algorithms and methods coming out every year that it can be overwhelming to learn all of them. Luckily, we have the State of Machine Learning and Data Science 2020 report, which was published by Kaggle in December last year, filled with statistics about Data Scientists from around the world including age breakdowns, educational background, programming skills, and salary information.

This article will focus on the most popular machine learning (ML) algorithms, explaining each method and the idea behind it while providing examples of its applications along with other helpful articles detailing the code…


Data Science

A thorough guide to building a simple XGBoost machine learning model for a data science competition.

The factors that determine whether a video goes viral are hard to narrow down, especially since popularity can be so subjective. What if we could use objective factors such as video metadata and thumbnails to predict how many views a video will get? This is the goal of the Video Popularity Prediction Challenge hosted on the data science competition platform Bitgrit.

In any data science competition, it’s important to start by posing some questions to better understand the problem, the goal, and the data you’re given. So, let’s start by asking these questions:

What is the goal?

The goal of this challenge is to…


Data Science Competitions

Ready to take your skills to the next level at these competition platforms?

Photo by Boicu Andrei on Unsplash

I previously wrote about “10 Data Science Competitions for you to hone your skills for 2020” in 2019, so I thought this would be a good time to update it for 2021. The list will be quite similar, with some new additions. Check out the earlier article if you haven’t already!

Why Data Science Competitions

Data Science requires utilizing statistical techniques and machine learning algorithms to work with a lot of data, and to effectively do so you need a lot of practice.

One great opportunity to practice is with data competitions. They serve as a platform to learn the best practices, gain feedback, and…


Key principles for creating useful and informative graphics

graphic visualization of the global emissions of the world’s countries
Our World in Data

Data visualization is an essential step in the data science process. It’s where you present your findings and communicate the results of your analysis in a graphical format that is both intuitive and easy to understand.

Data visualization takes hard work, and tons of cleaning and analysis go into distilling and manipulating messy data into beautiful graphs and charts. But even with data prepared, one still has to adhere to certain principles or methodologies to produce useful and informative graphics.

That said, in this article, I will be taking inspiration from Edward Tufte’s book Beautiful Evidence, which encapsulates six principles…


A collection of free resources to learn R for data science, machine learning, and statistics

partially opened Apple tablet
Photo by Nagy Arnold on Unsplash

Ever since the advent of personal computers and the internet — and now with social networking applications, streaming services, and ecommerce services with millions of users each day — the amount of data has clearly skyrocketed.

These terabytes of data stored in data centers or the cloud have a massive potential to generate useful insights and are highly valuable. The act of analyzing these data and extracting useful information from them has become a popular industry in itself.

The Art of Data Analysis

The art and science of data analysis, which combines multiple disciplines and methodologies, is becoming more and more accessible and democratized.

The…


Deep Learning

Training Neural Networks could soon be done on smartphones.

Photo by Tyler Lastovich on Unsplash

If you’ve read about OpenAI’s GPT-3, you would know that it is a model with 175 billion parameters, trained on massive amounts of data. But what you may not know about are the side effects of training such a model.

Training models is an incredibly energy-intensive task: it eats up huge amounts of data and computation, which results in major electricity consumption. …


Data Cleaning & Tidying

A must-know concept for Data Scientists.

Photo by Hunter Harritt on Unsplash

Introduction

There’s a popular saying in Data Science that goes like this: “Data Scientists spend up to 80% of their time on data cleaning and 20% of their time on actual data analysis”. The origin of this quote goes back to 2003, in Dasu and Johnson’s book, Exploratory Data Mining and Data Cleaning, and it is still true to this day.

In a typical Data Science project, from importing your data to communicating your results, tidying your data is a crucial part of making your workflow more productive and efficient.
