
Introduction

What Is String Formatting?

String formatting means building a well-presented string using the formatting techniques that a programming language provides. Python has several string formatting techniques. We are now going to explore the newer f-string formatting technique.

An f-string is evaluated at runtime. It is also fast compared to the earlier formatting methods.

f-strings have an easy syntax compared to Python's earlier string formatting techniques. We will look into every bit of this formatting using different examples.

Syntax

Every f-string consists of two parts: the prefix character f or F, followed by a string which…
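As a minimal sketch of the two-part syntax described above (the variable names & values are hypothetical, chosen only for illustration), an f-string might look like this:

```python
# Hypothetical values, used only to illustrate the syntax.
name = "Ada"
score = 91.456

# Part 1: the prefix f (or F). Part 2: a string literal whose
# {expressions} are evaluated at runtime and inserted into the text.
greeting = f"Hello, {name}!"
report = F"{name} scored {score:.1f} out of {50 * 2}"

print(greeting)  # Hello, Ada!
print(report)    # Ada scored 91.5 out of 100
```

The part after the colon (.1f here) is a format specifier; it rounds the value to one decimal place for display only.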



Many of us, while reading this tutorial, might think that there is nothing left to discover about Python's simple print function, since we all started learning Python with the evergreen example of printing Hello, World!. It's also true that in Python or, for that matter, any programming language, the print function is the most basic baby step we take while learning the language. …



A global interpreter lock (GIL) is a mechanism to apply a global lock on an interpreter. It is used in computer-language interpreters to synchronize & manage the execution of threads so that only one native thread (scheduled by the operating system) can execute at a time.

In a scenario where we have multiple threads, two threads might try to access the same memory at the same time & overwrite each other's data. Hence the need for a mechanism that prevents this.
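A rough sketch of what this means in practice, assuming a standard CPython build with the GIL enabled: the two threads below both run a CPU-bound loop, but only one of them executes Python bytecode at any moment. The function names & loop size are hypothetical.

```python
import threading

def count_down(n):
    """CPU-bound loop: pure Python bytecode, no I/O."""
    while n > 0:
        n -= 1
    return n

results = []

def worker(n):
    # list.append is safe to call from several threads in CPython.
    results.append(count_down(n))

# Two native threads, scheduled by the operating system.
threads = [threading.Thread(target=worker, args=(200_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # [0, 0]
```

Both threads finish correctly, but with the GIL they take turns rather than running in parallel, so this kind of workload is typically no faster than running the loop twice in a single thread.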



In this tutorial, we will learn about the built-in Python function range(). It is very popular & widely used, especially with for loops & sometimes with while loops. It returns an immutable sequence of numbers (its values cannot be changed after creation). range() takes between one & three arguments: a stop value, optionally preceded by a start value & followed by a step size.
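The argument forms described above can be sketched as follows (Python 3; the numbers are arbitrary). list() is used only to display the values that each range produces.

```python
# One argument: stop (start defaults to 0, step to 1).
one_arg = list(range(5))
# Two arguments: start and stop (stop is excluded).
two_args = list(range(2, 6))
# Three arguments: start, stop, and step size.
three_args = list(range(0, 10, 3))
# A negative step counts downwards.
backwards = list(range(5, 0, -2))

print(one_arg)     # [0, 1, 2, 3, 4]
print(two_args)    # [2, 3, 4, 5]
print(three_args)  # [0, 3, 6, 9]
print(backwards)   # [5, 3, 1]

# A range is immutable: assigning to an item raises TypeError.
r = range(3)
try:
    r[0] = 10
    is_immutable = False
except TypeError:
    is_immutable = True

# Typical use in a for loop.
total = 0
for i in range(1, 4):
    total += i  # adds 1, 2, 3
```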

The range() function exists in both Python 2 & Python 3, but it behaves differently. In Python 2, range() built a full list, while a similar function, xrange(), returned a lazily evaluated object & consumed less memory…



Arrays are the foundation of much of data science in Python. Arrays can be multidimensional, & all elements in an array must be of the same type, for example all integers or all floats.

Advantages of Using an Array

  • Arrays can handle very large datasets efficiently
  • Efficient in both computation & memory
  • Faster calculations & analysis than lists
  • Diverse functionality: many Python packages build on arrays & make trend modeling, statistics, & visualization easier

Basics of an Array

In Python, we can create a new data type, called an array, using the NumPy package. NumPy arrays are optimized for numerical analysis & contain only a single data type.

We first import NumPy & then use…
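A minimal sketch of creating arrays, assuming NumPy is installed & that np.array is the constructor used (the teaser is cut off before it names one). The values are hypothetical.

```python
import numpy as np

# Hypothetical measurements; every element becomes a float.
heights = np.array([1.73, 1.68, 1.71])
print(heights.dtype)  # float64

# Mixing ints and floats still gives one data type:
# the integers are converted to floats.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)  # float64

# Arithmetic is applied element by element, without a loop.
doubled = heights * 2

# Arrays can be multidimensional: 2 rows, 3 columns.
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid.shape)  # (2, 3)
```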



pip is the standard package manager used to install & maintain packages for Python. The Python standard library comes with a collection of built-in functions & built-in packages.

Data science packages like scikit-learn & statsmodels are NOT part of the Python standard library. They can be installed from the command line with pip.

Pip Documentation

Pip has a variety of commands & option flags designed to manage Python packages.


Removing duplicates is an essential skill for getting accurate counts, because we often don’t want to count the same thing multiple times. In Python, this can be done with the pandas library, whose DataFrames have a method called drop_duplicates.

Let’s understand how to use it with the help of a few examples.

Dropping Duplicate Names

Let’s say we have a DataFrame that contains vet visits, & the vet’s office wants to know how many dogs of each breed have visited their office. However, there are dogs like Max & Stella, who have visited the vet more than once in the dataset. …
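A small sketch of this scenario, assuming pandas is installed. The DataFrame name, column names, & rows are hypothetical stand-ins, since the original dataset is not shown here.

```python
import pandas as pd

# Hypothetical vet-visit records: Max and Stella each appear twice.
vet_visits = pd.DataFrame({
    "name": ["Max", "Stella", "Max", "Cooper", "Stella"],
    "breed": ["Labrador", "Chihuahua", "Labrador", "Labrador", "Chihuahua"],
})

# Keep the first row for each (name, breed) pair. Using both columns
# avoids merging two different dogs that happen to share a name.
unique_dogs = vet_visits.drop_duplicates(subset=["name", "breed"])

# Count dogs per breed, each dog counted once.
breed_counts = unique_dogs["breed"].value_counts()
print(breed_counts)
```

Without the drop_duplicates step, value_counts would report three Labradors & two Chihuahuas, because repeat visits would be counted as separate dogs.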



We can write our very own Python functions using the def keyword, function headers, docstrings, & function bodies. However, there’s a quicker way to write functions on the fly, & these are called lambda functions because we use the keyword lambda.

Some function definitions are simple enough that they can be converted to a lambda function. By doing this, we write fewer lines of code, which is pretty awesome & will come in handy, especially when we’re writing & maintaining big programs.

Lambda Function

Here we rewrite our function raise_to_power as a lambda function. After the keyword lambda, we specify the names…
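The original raise_to_power definition is not shown in this excerpt, so the sketch below assumes it takes two arguments, x & y, & returns x raised to the power y. Under that assumption, the def version & the lambda version could look like this:

```python
# Assumed original: a regular function written with def.
def raise_to_power(x, y):
    """Return x raised to the power y."""
    return x ** y

# The same logic as a lambda: argument names after the keyword,
# then a colon, then the single expression to return.
raise_to_power_lambda = lambda x, y: x ** y

print(raise_to_power(2, 3))         # 8
print(raise_to_power_lambda(2, 3))  # 8

# Lambdas are often passed directly to other functions.
squares = list(map(lambda n: n ** 2, [1, 2, 3]))
print(squares)  # [1, 4, 9]
```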



If we have a DataFrame & would like to select a few specific rows or columns from it, we can use square brackets or more advanced methods such as loc & iloc.

Selecting Columns Using Square Brackets

Now suppose that we want to select the country column from the brics DataFrame. To achieve this, we will type brics & then the column label inside the square brackets.

Selecting a Column
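A minimal sketch of the selection, assuming pandas is installed. The contents of brics are not shown in this excerpt, so the DataFrame below is an illustrative stand-in with a country column.

```python
import pandas as pd

# Illustrative stand-in for the brics DataFrame.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Single square brackets with a column label return a Series.
country_series = brics["country"]
print(type(country_series).__name__)  # Series

# Double square brackets (a list of labels) return a DataFrame.
country_frame = brics[["country"]]
print(type(country_frame).__name__)  # DataFrame
```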



We’ll learn techniques for cleaning messy data in SQL, a must-have skill for any data scientist.

Real-world data is almost always messy. As a data scientist, a data analyst, or even a developer, if you need to discover facts about data, it’s vital to ensure that the data is tidy enough to do so.

In this tutorial, we will practice some of the most common data cleaning techniques in SQL. We will create our own dummy dataset, but the techniques can be applied to real-world tabular data as…
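The excerpt does not say which techniques the tutorial covers, so the sketch below is only an assumption of what typical cleaning steps might look like: trimming stray whitespace, normalizing letter case, & filling in missing values. It runs SQL through Python's built-in sqlite3 module, & the customers table & its rows are a hypothetical dummy dataset.

```python
import sqlite3

# In-memory database holding a hypothetical, deliberately messy table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "  alice ", "NEW YORK"),
        (2, "Bob", " new york"),
        (3, "carol  ", None),
    ],
)

# TRIM removes surrounding spaces, UPPER normalizes case,
# and COALESCE replaces NULL with a placeholder value.
rows = conn.execute(
    """
    SELECT id,
           TRIM(name) AS name,
           UPPER(TRIM(COALESCE(city, 'unknown'))) AS city
    FROM customers
    ORDER BY id
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
# (1, 'alice', 'NEW YORK')
# (2, 'Bob', 'NEW YORK')
# (3, 'carol', 'UNKNOWN')
```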

Jason Joseph

Data Scientist & Machine Learning Engineer
