I really like the ad-free, simple but eye-pleasing interface on Medium. It easily attracts readers who come to the platform to read something interesting. It also provides simple but useful tools for writers to create eye-pleasing content.

I have been writing on Medium for months. Since I prefer Data Science and Machine Learning, I often write content related to those fields. As a writer, I always try to improve the **quality** of the content by considering various aspects.

In this post, I will share 10 best practices that I always follow when writing content on Medium. These best practices…

You cannot get the best out of your machine learning model without doing any hyperparameter optimization (tuning). The default hyperparameter values rarely give the best model for your data. Scikit-learn, the Python machine learning library, provides two special classes for hyperparameter optimization:

- **GridSearchCV:** for Grid Search
- **RandomizedSearchCV:** for Random Search

If you’re new to the Data Science and Machine Learning fields, you may not be familiar with these terms. In this post, I’ll put more emphasis on the *Python implementation of Grid Search and Random Search and explain the difference between them*. …
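To make the difference concrete, here is a minimal sketch of both searches. The iris dataset, the decision tree model, and the hyperparameter ranges are my own assumptions for illustration; they are not taken from the original post.

```python
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

# Assumed example data and model (not from the original post).
X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(random_state=0)

# Grid Search: evaluates every combination in the grid (3 x 2 = 6 candidates).
param_grid = {"max_depth": [2, 4, 6], "min_samples_split": [2, 5]}
grid = GridSearchCV(model, param_grid, cv=3)
grid.fit(X, y)

# Random Search: samples a fixed number of candidates (n_iter) from distributions.
param_dist = {"max_depth": randint(2, 11), "min_samples_split": randint(2, 11)}
rand = RandomizedSearchCV(model, param_dist, n_iter=5, cv=3, random_state=0)
rand.fit(X, y)

print("Grid Search best params:", grid.best_params_)
print("Random Search best params:", rand.best_params_)
```

Grid Search grows with the size of the grid, while Random Search lets you cap the number of candidates with `n_iter`, which is usually the practical reason to prefer it on large search spaces.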

In linear algebra, a system of linear equations is defined as a collection of two or more *linear* equations having the same set of variables. All equations in the system are considered simultaneously. Systems of linear equations are used in different sectors such as Manufacturing, Marketing, Business, Transportation, etc.

Solving a system of linear equations becomes more complicated as the number of equations and variables increases. The solution must satisfy every equation in the system. In Python, **NumPy** (**Num**erical **Py**thon), **SciPy** (**Sci**entific **Py**thon) and **SymPy** (**Sym**bolic **Py**thon) libraries can be used to solve systems of…
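As a small illustration with NumPy, the sketch below solves a made-up system of two equations (the coefficients are hypothetical, chosen only so the answer is easy to check by hand) and verifies that the solution satisfies every equation.

```python
import numpy as np

# Hypothetical system, chosen for illustration:
#   2x +  y =  5
#    x - 3y = -8
A = np.array([[2.0, 1.0],
              [1.0, -3.0]])   # coefficient matrix
b = np.array([5.0, -8.0])     # right-hand side

# np.linalg.solve handles square systems with a unique solution.
solution = np.linalg.solve(A, b)
x, y = solution
print("x =", x, "y =", y)

# The solution must satisfy every equation in the system.
print("All equations satisfied:", np.allclose(A @ solution, b))
```

Here the solution is x = 1 and y = 3, which you can confirm by substituting into both equations.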

Many people who are already data scientists or new to the field of data science are looking for an answer to the question **“Will AutoML (Automated Machine Learning) replace data scientists?”** Asking a question like this is very reasonable because

AutoML will ***NOT*** replace your data science profession. It’s just here to make…

In both Statistics and Machine Learning, the number of attributes, features or input variables of a dataset is referred to as its **dimensionality**. For example, let’s take a very simple dataset containing 2 attributes called *Height* and *Weight*. This is a 2-dimensional dataset and any observation of this dataset can be plotted in a 2D plot.

**Multicollinearity** occurs when features (input variables) are highly correlated with one or more of the other features in the dataset. It affects the performance of regression and classification models. PCA (Principal Component Analysis) takes advantage of multicollinearity and combines the highly correlated variables into a set of uncorrelated variables. Therefore, PCA can effectively eliminate multicollinearity between features.

In this post, we’ll build a logistic regression model on a classification dataset called *breast_cancer* data. The initial model can be considered as the base model. Then, we’ll apply PCA on *breast_cancer* data and build the logistic regression model again. After that, we’ll…
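The claim that PCA removes multicollinearity can be checked directly. The sketch below is not the model-building code from the post; it only compares the largest pairwise correlation before and after PCA on the *breast_cancer* data. The number of components (6) and the helper name `max_abs_offdiag_corr` are my own assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_breast_cancer(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)  # PCA is sensitive to feature scale

def max_abs_offdiag_corr(data):
    """Largest absolute correlation between two different columns (hypothetical helper)."""
    corr = np.corrcoef(data, rowvar=False)
    np.fill_diagonal(corr, 0.0)  # ignore each column's correlation with itself
    return np.abs(corr).max()

# n_components=6 is an assumed value for illustration only.
X_pca = PCA(n_components=6).fit_transform(X_scaled)

corr_before = max_abs_offdiag_corr(X_scaled)
corr_after = max_abs_offdiag_corr(X_pca)
print("Max correlation between original features:", corr_before)
print("Max correlation between principal components:", corr_after)
```

The original features include strongly correlated pairs, while the principal components are uncorrelated up to floating-point error.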

Data scientist was the number 1 job role in 2020. However, in 2021, machine learning engineer is trending. There are still unfilled vacancies for data science professions in many countries.

Many people are studying data science and machine learning nowadays. Their ultimate goal is to land a dream data science job. However, most of them don’t know the reality of a data science job because they don’t deal with real-world problems while they’re learning the subject.

You may be familiar with different machine learning algorithms. You may also know the behind-the-scenes process of each algorithm. You may…

Previously, I published an article called “10 Real Truths about Machine Learning”. Today, I’ll list 13 real-world insights in machine learning that are not in the previous article’s list. In this short (but useful) article, the emphasis is on *real-world* insights.

Let’s go through the list. Wherever possible, I’ll add the links to my previous articles so that you can visit them to find more information on a specific point.

- *Data Scientist* and *Machine Learning Engineer* are two completely different roles.
- The machine learning engineer is trending and will be the number 1 job in…

If you’re learning Data Science and Machine Learning, you definitely need a laptop. This is because you need to write and run your own code to get hands-on experience. When you also consider portability, a laptop is a better option than a desktop.

A traditional laptop may not be perfect for your data science and machine learning tasks. You need to consider laptop specifications carefully to choose the right laptop. If you’re looking to buy a laptop for data science and machine learning tasks, this post is for you! …

In part 1 and part 2, we’ve learned how to inspect, describe and summarize a Pandas DataFrame. Today, we’ll learn how to extract a subset of a Pandas DataFrame. This is very useful because we often want to perform operations on subsets of our data. There are many different ways of subsetting a Pandas DataFrame. You may need to select specific columns with all rows. Sometimes, you want to select specific rows with all columns or select rows and columns that meet a specific criterion, etc.
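The three cases mentioned above can be sketched as follows. The small DataFrame and its column names are made up for illustration, and this is only one of several ways to write each subset.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cara", "Dan"],
        "age": [28, 35, 42, 19],
        "city": ["Oslo", "Lima", "Oslo", "Kyoto"],
    }
)

# Specific columns with all rows.
cols = df[["name", "age"]]

# Specific rows (by position) with all columns.
rows = df.iloc[1:3]

# Rows that meet a criterion, restricted to specific columns.
filtered = df.loc[df["age"] > 30, ["name", "city"]]

print(cols)
print(rows)
print(filtered)
```

Note that `iloc[1:3]` is position-based and excludes the end position, so it returns the second and third rows only.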

The different ways of subsetting can be divided into 4 categories: **Selection**, **Slicing**, **Indexing…**

Data Analyst with Python || Bring data into actionable insights