UnicodePlot: Plot Your Data By Unicode Characters for Ruby

UnicodePlot - Plot your data by Unicode characters

UnicodePlot lets you make charts with Unicode characters.

Install

$ gem install unicode_plot

Usage

require 'unicode_plot'

x = 0.step(3*Math::PI, by: 3*Math::PI / 30)
y_sin = x.map {|xi| Math.sin(xi) }
y_cos = x.map {|xi| Math.cos(xi) }
plot = UnicodePlot.lineplot(x, y_sin, name: "sin(x)", width: 40, height: 10)
UnicodePlot.lineplot!(plot, x, y_cos, name: "cos(x)")
plot.render

You can get the results below by running the above script:

Supported charts

barplot

UnicodePlot.barplot(data: {'foo': 20, 'bar': 50}, title: "Bar").render

boxplot

UnicodePlot.boxplot(data: {foo: [1, 3, 5], bar: [3, 5, 7]}, title: "Box").render

densityplot

x = Array.new(500) { 20*rand - 10 } + Array.new(500) { 6*rand - 3 }
y = Array.new(1000) { 30*rand - 10 }
UnicodePlot.densityplot(x, y, title: "Density").render

histogram

x = Array.new(100) { rand(10) } + Array.new(100) { rand(30) + 10 }
UnicodePlot.histogram(x, title: "Histogram").render

lineplot

See Usage section above.

scatterplot

x = Array.new(50) { rand(20) - 10 }
y = x.map {|xx| xx*rand(30) - 10 }
UnicodePlot.scatterplot(x, y, title: "Scatter").render

Acknowledgement

This library is strongly inspired by UnicodePlot.jl.

Documentation

https://red-data-tools.github.io/unicode_plot.rb/

License

MIT License

Author


Author: red-data-tools
Source code: https://github.com/red-data-tools/unicode_plot.rb
License: MIT license

#ruby 

Anil  Sakhiya


1652748716

Exploratory Data Analysis(EDA) with Python

Exploratory Data Analysis Tutorial | Basics of EDA with Python

Exploratory data analysis is used by data scientists to analyze and investigate data sets and summarize their main characteristics, often employing data visualization methods. It helps determine how best to manipulate data sources to get the answers you need, making it easier for data scientists to discover patterns, spot anomalies, test a hypothesis, or check assumptions. EDA is primarily used to see what data can reveal beyond the formal modeling or hypothesis testing task and provides a better understanding of data set variables and the relationships between them. It can also help determine if the statistical techniques you are considering for data analysis are appropriate or not.

🔹 Topics Covered:
00:00:00 Basics of EDA with Python
01:40:10 Multiple Variate Analysis
02:30:26 Outlier Detection
03:44:48 Cricket World Cup Analysis using Exploratory Data Analysis


Learning the basics of Exploratory Data Analysis using Python with Numpy, Matplotlib, and Pandas.

What is Exploratory Data Analysis(EDA)?

If we want to explain EDA in simple terms, it means trying to understand the given data much better, so that we can make some sense out of it.

We can find a more formal definition on Wikipedia.

In statistics, exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.

EDA in Python uses data visualization to draw meaningful patterns and insights. It also involves the preparation of data sets for analysis by removing irregularities in the data.

Based on the results of EDA, companies also make business decisions, which can have repercussions later.

  • If EDA is not done properly then it can hamper the further steps in the machine learning model building process.
  • If done well, it may improve the efficacy of everything we do next.

This article covers the following topics:

  1. Data Sourcing
  2. Data Cleaning
  3. Univariate analysis
  4. Bivariate analysis
  5. Multivariate analysis

1. Data Sourcing

Data Sourcing is the process of finding and loading the data into our system. Broadly there are two ways in which we can find data.

  1. Private Data
  2. Public Data

Private Data

As the name suggests, private data is given by private organizations. There are some security and privacy concerns attached to it. This type of data is mainly used for an organization’s internal analysis.

Public Data

This type of data is available to everyone. We can find it on government websites, at public organizations, etc. Anyone can access this data; we do not need any special permissions or approval.

We can get public data on the following sites.

The very first step of EDA is Data Sourcing. We have seen how we can access data and load it into our system. The next step is to clean the data.

2. Data Cleaning

After completing the Data Sourcing, the next step in the process of EDA is Data Cleaning. It is very important to get rid of the irregularities and clean the data after sourcing it into our system.

Irregularities in data come in different types.

  • Missing Values
  • Incorrect Format
  • Incorrect Headers
  • Anomalies/Outliers

To perform the data cleaning we are using a sample data set, which can be found here.

We are using Jupyter Notebook for analysis.

First, let’s import the necessary libraries and store the data in our system for analysis.

#import the useful libraries.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

# Read the data set of "Marketing Analysis" in data.
data= pd.read_csv("marketing_analysis.csv")

# Printing the data
data

Now, the data set looks like this,

If we observe the above dataset, there are some discrepancies in the column header for the first 2 rows. The correct data starts from index number 1, so we have to fix the first two rows.

This is called Fixing the Rows and Columns. Let’s ignore the first two rows and load the data again.

#import the useful libraries.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

# Read the file in data without first two rows as it is of no use.
data = pd.read_csv("marketing_analysis.csv",skiprows = 2)

#print the head of the data frame.
data.head()

Now, the dataset looks like this, and it makes more sense.

Dataset after fixing the rows and columns
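
To see what skipping the first two rows does, here is a minimal sketch that uses only the standard library instead of pandas. The file contents and the helper name read_skipping_rows are made up for illustration; the real marketing_analysis.csv is not reproduced here.

```python
import csv
import io

# Hypothetical file contents: two junk rows sit above the real header.
raw_text = """banner,,,
generated by export tool,,,
customerid,age,salary,response
1,58,100000,no
2,44,60000,yes
"""


def read_skipping_rows(text, skiprows):
    """Return the rows as dicts, ignoring the first `skiprows` lines."""
    lines = io.StringIO(text).readlines()[skiprows:]
    # The first remaining line is treated as the header row.
    return list(csv.DictReader(lines))


rows = read_skipping_rows(raw_text, skiprows=2)
print(rows[0])
```

Once the junk rows are dropped, the header line is read correctly and every record maps column names to values.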

Following are the steps to be taken while Fixing Rows and Columns:

  1. Delete Summary Rows and Columns in the Dataset.
  2. Delete Header and Footer Rows on every page.
  3. Delete Extra Rows like blank rows, page numbers, etc.
  4. We can merge different columns if it makes for a better understanding of the data.
  5. Similarly, we can also split one column into multiple columns based on our requirements or understanding.
  6. Add column names. It is very important for the dataset to have column names.

Now if we observe the above dataset, the customerid column is of no importance to our analysis, and the jobedu column holds both the job and the education information.

So, we’ll drop the customerid column, split the jobedu column into two new columns, job and education, and then drop the jobedu column as well.

# Drop the customer id as it is of no use.
data.drop('customerid', axis = 1, inplace = True)

# Extract job and education into new columns from the "jobedu" column.
data['job']= data["jobedu"].apply(lambda x: x.split(",")[0])
data['education']= data["jobedu"].apply(lambda x: x.split(",")[1])

# Drop the "jobedu" column from the dataframe.
data.drop('jobedu', axis = 1, inplace = True)

# Printing the Dataset
data

Now, the dataset looks like this,

Dropping Customerid and jobedu columns and adding job and education columns
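
The split itself is plain string handling. As a rough sketch outside pandas, assuming each jobedu value has the form "job,education", it could look like the code below. The records and the helper split_jobedu are hypothetical, and falling back to "unknown" when the comma is missing is our own choice, not something the dataset prescribes.

```python
def split_jobedu(value):
    """Split a 'job,education' string into a (job, education) pair."""
    job, sep, education = value.partition(",")
    # If there is no comma, report the education as unknown instead of failing.
    return job.strip(), (education.strip() if sep else "unknown")


# Hypothetical records shaped like the jobedu column.
records = [
    {"jobedu": "management,tertiary"},
    {"jobedu": "technician,secondary"},
    {"jobedu": "retired"},
]

for rec in records:
    # pop() removes the combined column, like dropping jobedu afterwards.
    rec["job"], rec["education"] = split_jobedu(rec.pop("jobedu"))

print(records)
```

Unlike indexing the result of split(",") with [1], this version does not raise an error on a value that has no comma.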

Missing Values

If there are missing values in the dataset, we need to handle them before doing any statistical analysis.

There are mainly three types of missing values.

  1. MCAR (Missing completely at random): The missingness does not depend on any feature, observed or unobserved.
  2. MAR (Missing at random): The missingness may depend on other, observed features.
  3. MNAR (Missing not at random): The missingness depends on the missing value itself or on something unobserved, so there is a reason why the values are missing.

Let’s see which columns have missing values in the dataset.

# Checking the missing values
data.isnull().sum()

The output will be,

As we can see, three columns contain missing values. Let’s see how to handle them. We can handle missing values by dropping the missing records or by imputing the values.

Drop the missing Values

Let’s handle missing values in the age column.

# Dropping the records with age missing in data dataframe.
data = data[~data.age.isnull()].copy()

# Checking the missing values in the dataset.
data.isnull().sum()

Let’s check the missing values in the dataset now.

Let’s impute values to the missing values for the month column.

Since the month column is of an object type, let’s calculate the mode of that column and impute those values to the missing values.

# Find the mode of month in data
month_mode = data.month.mode()[0]

# Fill the missing values with mode value of month in data.
data['month'] = data['month'].fillna(month_mode)

# Let's see the null values in the month column.
data.month.isnull().sum()

Now output is,

# Mode of month is
'may, 2017'
# Null values in month column after imputing with mode
0
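
The two strategies used so far, dropping records and imputing the mode, can also be sketched in plain Python. The records below are invented for illustration and use None for a missing value; only the idea matches the pandas code above.

```python
from collections import Counter

# Hypothetical records; None marks a missing value.
records = [
    {"age": 58, "month": "may, 2017"},
    {"age": None, "month": "jun, 2017"},
    {"age": 33, "month": None},
    {"age": 47, "month": "may, 2017"},
]

# Drop: keep only the records whose age is present.
with_age = [r for r in records if r["age"] is not None]

# Impute: find the most frequent observed month (the mode) ...
observed_months = [r["month"] for r in with_age if r["month"] is not None]
month_mode = Counter(observed_months).most_common(1)[0][0]

# ... and use it wherever the month is missing.
cleaned = [
    {**r, "month": r["month"] if r["month"] is not None else month_mode}
    for r in with_age
]

print(month_mode, len(cleaned))
```

Dropping shrinks the data set, while imputing keeps the record but replaces the gap with a guess, which is why the choice depends on the column.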

Next, let’s handle the missing values in the Response column. Since Response is our target column, imputing values into it would affect our analysis. So, it is better to drop the records with a missing Response.

#drop the records with response missing in data.
data = data[~data.response.isnull()].copy()
# Calculate the missing values in each column of data frame
data.isnull().sum()

Let’s check whether the missing values in the dataset have been handled or not,

All the missing values have been handled

We can also mark the missing values as ‘NaN’ so that they do not affect the outcome of any statistical analysis.

Handling Outliers

We have seen how to fix missing values, now let’s see how to handle outliers in the dataset.

Outliers are the values that are far beyond the next nearest data points.

There are two types of outliers:

  1. Univariate outliers: Univariate outliers are the data points whose values lie beyond the range of expected values based on one variable.
  2. Multivariate outliers: While plotting data, some values of one variable may not lie beyond the expected range, but when you plot the data with some other variable, these values may lie far from the expected value.

So, after understanding the causes of these outliers, we can handle them by dropping those records or imputing with the values or leaving them as is, if it makes more sense.
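
The text above does not fix a rule for deciding what counts as "far beyond" the other points. One common convention, used here purely as an assumption, is the 1.5 × IQR rule: anything below Q1 - 1.5·IQR or above Q3 + 1.5·IQR is flagged as a univariate outlier. The balances list and the helper iqr_outliers are hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical balances with one value far from the rest.
balances = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
outliers = iqr_outliers(balances)
print(outliers)
```

Flagging is only the first step. Whether to drop, impute, or keep the flagged records still depends on why they are there.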

Standardizing Values

To perform data analysis on a set of values, we have to make sure that the values in the same column are on the same scale. For example, if the data contains the top speeds of different companies’ cars, then the whole column should be in either meters/sec or miles/sec, not a mix of the two.
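
As a small sketch of this idea, suppose a hypothetical top-speed column arrives with mixed units. Converting every value to one unit (meters per second here, chosen arbitrarily) puts the column on a single scale. The data and names below are assumptions for illustration.

```python
# Conversion factors to meters per second.
TO_METERS_PER_SECOND = {
    "m/s": 1.0,
    "km/h": 1000 / 3600,
    "mph": 1609.344 / 3600,  # one mile is 1609.344 meters
}


def to_mps(value, unit):
    """Convert a speed in the given unit to meters per second."""
    return value * TO_METERS_PER_SECOND[unit]


# Hypothetical column with mixed units.
top_speeds = [(180, "km/h"), (120, "mph"), (50, "m/s")]
standardized = [round(to_mps(value, unit), 2) for value, unit in top_speeds]
print(standardized)
```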

Now that we are clear on how to source and clean the data, let’s see how we can analyze the data.

3. Univariate Analysis

If we analyze data over a single variable/column from a dataset, it is known as Univariate Analysis.

Categorical Unordered Univariate Analysis:

An unordered variable is a categorical variable that has no defined order. If we take our data as an example, the job column in the dataset is divided into many sub-categories like technician, blue-collar, services, management, etc. There is no weight or measure given to any value in the ‘job’ column.

Now, let’s analyze the job category by using plots. Since Job is a category, we will plot the bar plot.

# Let's calculate the percentage of each job status category.
data.job.value_counts(normalize=True)

#plot the bar graph of percentage job categories
data.job.value_counts(normalize=True).plot.barh()
plt.show()

The output looks like this,

By the above bar plot, we can infer that the data set contains more blue-collar workers than any other category.
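
What value_counts(normalize=True) computes is simply each category’s share of the total. A rough stand-in using the standard library, with an invented list of jobs and a hypothetical helper category_shares, might look like this:

```python
from collections import Counter


def category_shares(values):
    """Return each category's share of the total, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return {category: n / total for category, n in counts.most_common()}


# Hypothetical job column.
jobs = [
    "blue-collar", "management", "blue-collar", "technician",
    "blue-collar", "management", "services", "blue-collar",
]
shares = category_shares(jobs)
print(shares)
```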

Categorical Ordered Univariate Analysis:

Ordered variables are those variables that have a natural rank of order. Some examples of categorical ordered variables from our dataset are:

  • Month: Jan, Feb, March……
  • Education: Primary, Secondary,……

Now, let’s analyze the Education variable from the dataset. Since we’ve already seen a bar plot, let’s see what a pie chart looks like.

#calculate the percentage of each education category.
data.education.value_counts(normalize=True)

#plot the pie chart of education categories
data.education.value_counts(normalize=True).plot.pie()
plt.show()

The output will be,

By the above analysis, we can infer that most people in the data set have secondary education, followed by tertiary and then primary. Also, for a very small percentage, the education level is unknown.

This is how we perform univariate analysis on categorical variables. If the column or variable is numerical, then we’ll analyze it by calculating its mean, median, std, etc. We can get those values by using the describe function.

data.salary.describe()

The output will be,
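
The describe output itself is not reproduced here. As a rough illustration of the kind of statistics it reports, the sketch below computes a few of them with the standard library on a made-up salary list. The helper name describe is ours, and it uses the sample standard deviation, which is also what pandas reports by default.

```python
import statistics


def describe(values):
    """Summarize a numeric column with a few basic statistics."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
    }


# Hypothetical salary column.
salaries = [20000, 50000, 60000, 60000, 100000]
summary = describe(salaries)
print(summary)
```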

4. Bivariate Analysis

If we analyze data by taking two variables/columns into consideration from a dataset, it is known as Bivariate Analysis.

a) Numeric-Numeric Analysis:

Analyzing the two numeric variables from a dataset is known as numeric-numeric analysis. We can analyze it in three different ways.

  • Scatter Plot
  • Pair Plot
  • Correlation Matrix

Scatter Plot

Let’s take three columns, ‘Balance’, ‘Age’, and ‘Salary’, from our dataset and see what we can infer by drawing scatter plots of salary against balance and of age against balance.

#plot the scatter plot of balance and salary variable in data
plt.scatter(data.salary,data.balance)
plt.show()

#plot the scatter plot of balance and age variable in data
data.plot.scatter(x="age",y="balance")
plt.show()

Now, the scatter plots look like this,

Pair Plot

Now, let’s plot Pair Plots for the three columns we used in plotting Scatter plots. We’ll use the seaborn library for plotting Pair Plots.

#plot the pair plot of salary, balance and age in data dataframe.
sns.pairplot(data = data, vars=['salary','balance','age'])
plt.show()

The Pair Plot looks like this,

Correlation Matrix

Since we cannot use more than two variables as x-axis and y-axis in Scatter and Pair Plots, it is difficult to see the relation between three numerical variables in a single graph. In those cases, we’ll use the correlation matrix.

# Creating a matrix using age, salary, balance as rows and columns
data[['age','salary','balance']].corr()

#plot the correlation matrix of salary, balance and age in data dataframe.
sns.heatmap(data[['age','salary','balance']].corr(), annot=True, cmap = 'Reds')
plt.show()

First, we created a matrix using age, salary, and balance. After that, we are plotting the heatmap using the seaborn library of the matrix.
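
Each cell of that matrix is a Pearson correlation coefficient between two columns, which is what corr() computes by default. A minimal sketch of the calculation, using invented columns and hypothetical helper names, is shown below.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Build a nested dict holding the correlation of every column pair."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical columns; salary rises in a straight line with age here.
columns = {
    "age": [25, 35, 45, 55],
    "salary": [30000, 50000, 70000, 90000],
    "balance": [500, 100, 900, 300],
}
matrix = correlation_matrix(columns)
print(matrix["age"]["salary"])
```

Values close to 1 or -1 indicate a strong linear relationship, and values close to 0 indicate a weak one.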

b) Numeric - Categorical Analysis

Analyzing one numeric variable and one categorical variable from a dataset is known as numeric-categorical analysis. We analyze them mainly using mean, median, and box plots.

Let’s take salary and response columns from our dataset.

First check for mean value using groupby

#groupby the response to find the mean of the salary with response no & yes separately.
data.groupby('response')['salary'].mean()

The output will be,

There is not much of a difference between the yes and no response based on the salary.

Let’s calculate the median,

#groupby the response to find the median of the salary with response no & yes separately.
data.groupby('response')['salary'].median()

The output will be,

By both mean and median we can say that the response of yes and no remains the same irrespective of the person’s salary. But is that really the case? Let’s plot the box plot for them and check the behavior.

#plot the box plot of salary for yes & no responses.
sns.boxplot(x=data.response, y=data.salary)
plt.show()

The box plot looks like this,

As we can see, when we plot the Box Plot, it paints a very different picture compared to mean and median. The IQR for customers who gave a positive response is on the higher salary side.

This is how we analyze numeric-categorical variables: we use mean, median, and box plots to draw conclusions.
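
To see why the mean and median alone can hide a difference, here is a small sketch on invented data. Both groups share the same mean and median salary, yet their spread differs a lot, which is the kind of detail a box plot makes visible. The rows and the helper group_stats are hypothetical.

```python
import statistics
from collections import defaultdict


def group_stats(rows, key, value):
    """Compute mean, median, and range of `value` for each group of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {
        name: {
            "mean": statistics.mean(vals),
            "median": statistics.median(vals),
            "range": max(vals) - min(vals),
        }
        for name, vals in groups.items()
    }


# Hypothetical salary and response data.
rows = [
    {"response": "no", "salary": 20000},
    {"response": "no", "salary": 60000},
    {"response": "no", "salary": 100000},
    {"response": "yes", "salary": 55000},
    {"response": "yes", "salary": 60000},
    {"response": "yes", "salary": 65000},
]
stats = group_stats(rows, "response", "salary")
print(stats)
```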

c) Categorical — Categorical Analysis

Since our target variable/column is the Response rate, we’ll see how the different categories like Education, Marital Status, etc., are associated with the Response column. So instead of ‘Yes’ and ‘No’ we will convert them into ‘1’ and ‘0’, by doing that we’ll get the “Response Rate”.

#create response_rate of numerical data type where response "yes"= 1, "no"= 0
data['response_rate'] = np.where(data.response=='yes',1,0)
data.response_rate.value_counts()

The output looks like this,

Let’s see how the response rate varies for different categories in marital status.

#plot the bar graph of marital status with average value of response_rate
data.groupby('marital')['response_rate'].mean().plot.bar()
plt.show()

The graph looks like this,

By the above graph, we can infer that the positive response is more for Single status members in the data set. Similarly, we can plot the graphs for Loan vs Response rate, Housing Loans vs Response rate, etc.
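
The response rate of a group is just the mean of the 0/1 encoded responses in that group. A plain-Python sketch on invented rows, with a hypothetical helper response_rate_by, could look like this:

```python
from collections import defaultdict


def response_rate_by(rows, key):
    """Share of 'yes' responses within each group of `key`."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        # Encode "yes" as 1 and anything else as 0, then add it up.
        positives[row[key]] += 1 if row["response"] == "yes" else 0
    return {group: positives[group] / totals[group] for group in totals}


# Hypothetical marital status and response data.
rows = [
    {"marital": "single", "response": "yes"},
    {"marital": "single", "response": "no"},
    {"marital": "married", "response": "no"},
    {"marital": "married", "response": "no"},
    {"marital": "married", "response": "yes"},
    {"marital": "married", "response": "no"},
    {"marital": "divorced", "response": "no"},
    {"marital": "single", "response": "yes"},
]
rates = response_rate_by(rows, "marital")
print(rates)
```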

5. Multivariate Analysis

If we analyze data by taking more than two variables/columns into consideration from a dataset, it is known as Multivariate Analysis.

Let’s see how ‘Education’, ‘Marital’, and ‘Response_rate’ vary with each other.

First, we’ll create a pivot table with the three columns and after that, we’ll create a heatmap.

result = pd.pivot_table(data=data, index='education', columns='marital',values='response_rate')
print(result)

#create heat map of education vs marital vs response_rate
sns.heatmap(result, annot=True, cmap = 'RdYlGn', center=0.117)
plt.show()

The pivot table and heatmap look like this,

Based on the heatmap, we can infer that married people with primary education are less likely to respond positively to the survey, and single people with tertiary education are most likely to respond positively.

Similarly, we can plot the graphs for Job vs marital vs response, Education vs poutcome vs response, etc.
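
A pivot table of this kind groups the rows by two keys and averages a third column in each cell (the mean is the default aggregation of pd.pivot_table). As a rough sketch without pandas, on invented rows and with a hypothetical helper pivot_mean:

```python
from collections import defaultdict


def pivot_mean(rows, index, columns, values):
    """Average `values` for every (index, columns) combination."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[index], row[columns])].append(row[values])
    table = defaultdict(dict)
    for (row_key, col_key), vals in cells.items():
        table[row_key][col_key] = sum(vals) / len(vals)
    return {row_key: dict(cols) for row_key, cols in table.items()}


# Hypothetical rows with an already encoded response_rate column.
rows = [
    {"education": "primary", "marital": "married", "response_rate": 0},
    {"education": "primary", "marital": "married", "response_rate": 0},
    {"education": "primary", "marital": "single", "response_rate": 1},
    {"education": "tertiary", "marital": "single", "response_rate": 1},
    {"education": "tertiary", "marital": "single", "response_rate": 0},
    {"education": "tertiary", "marital": "married", "response_rate": 0},
]
result = pivot_mean(rows, "education", "marital", "response_rate")
print(result)
```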

Conclusion

This is how we do Exploratory Data Analysis. Exploratory Data Analysis (EDA) helps us to look beyond the data. The more we explore the data, the more insights we draw from it. As a data analyst, almost 80% of our time will be spent understanding data and solving various business problems through EDA.

Thank you for reading and Happy Coding!!!

#dataanalysis #python

Dylan  Iqbal


1561523460

Matplotlib Cheat Sheet: Plotting in Python

This Matplotlib cheat sheet introduces you to the basics that you need to plot your data with Python and includes code samples.

Data visualization and storytelling with your data are essential skills that every data scientist needs to communicate insights gained from analyses effectively to any audience out there. 

For most beginners, the first package that they use to get in touch with data visualization and storytelling is, naturally, Matplotlib: it is a Python 2D plotting library that enables users to make publication-quality figures. But, what might be even more convincing is the fact that other packages, such as Pandas, intend to build more plotting integration with Matplotlib as time goes on.

However, what might slow down beginners is the fact that this package is pretty extensive. There is so much that you can do with it and it might be hard to still keep a structure when you're learning how to work with Matplotlib.   

DataCamp has created a Matplotlib cheat sheet for those who might already know how to use the package to their advantage to make beautiful plots in Python, but who still want to keep a one-page reference handy. Of course, for those who don't know how to work with Matplotlib, this might be the extra push to be convinced and finally get started with data visualization in Python. 

You'll see that this cheat sheet presents you with the six basic steps that you can go through to make beautiful plots. 


With this handy reference, you'll familiarize yourself in no time with the basics of Matplotlib: you'll learn how you can prepare your data, create a new plot, use some basic plotting routines to your advantage, add customizations to your plots, and save, show and close the plots that you make.

What might have looked difficult before will definitely be clearer once you start using this cheat sheet! Use it in combination with the Matplotlib Gallery and the documentation.

Matplotlib 

Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.

Prepare the Data 

1D Data 

>>> import numpy as np
>>> x = np.linspace(0, 10, 100)
>>> y = np.cos(x)
>>> z = np.sin(x)

2D Data or Images 

>>> data = 2 * np.random.random((10, 10))
>>> data2 = 3 * np.random.random((10, 10))
>>> Y, X = np.mgrid[-3:3:100j, -3:3:100j]
>>> U = -1 - X**2 + Y
>>> V = 1 + X - Y**2
>>> from matplotlib.cbook import get_sample_data
>>> img = np.load(get_sample_data('axes_grid/bivariate_normal.npy'))

Create Plot

>>> import matplotlib.pyplot as plt

Figure 

>>> fig = plt.figure()
>>> fig2 = plt.figure(figsize=plt.figaspect(2.0))

Axes 

>>> fig.add_axes([0.1, 0.1, 0.8, 0.8]) #[left, bottom, width, height] as fractions of the figure
>>> ax1 = fig.add_subplot(221) #row-col-num
>>> ax3 = fig.add_subplot(212)
>>> fig3, axes = plt.subplots(nrows=2,ncols=2)
>>> fig4, axes2 = plt.subplots(ncols=3)

Save Plot 

>>> plt.savefig('foo.png') #Save figures
>>> plt.savefig('foo.png',  transparent=True) #Save transparent figures

Show Plot

>>> plt.show()

Plotting Routines 

1D Data 

>>> fig, ax = plt.subplots()
>>> lines = ax.plot(x,y) #Draw points with lines or markers connecting them
>>> ax.scatter(x,y) #Draw unconnected points, scaled or colored
>>> axes[0,0].bar([1,2,3],[3,4,5]) #Plot vertical rectangles (constant width)
>>> axes[1,0].barh([0.5,1,2.5],[0,1,2]) #Plot horizontal rectangles (constant height)
>>> axes[1,1].axhline(0.45) #Draw a horizontal line across axes
>>> axes[0,1].axvline(0.65) #Draw a vertical line across axes
>>> ax.fill(x,y,color='blue') #Draw filled polygons
>>> ax.fill_between(x,y,color='yellow') #Fill between y values and 0

2D Data 

>>> fig, ax = plt.subplots()
>>> im = ax.imshow(img, #Colormapped or RGB arrays
      cmap= 'gist_earth', 
      interpolation= 'nearest',
      vmin=-2,
      vmax=2)
>>> axes2[0].pcolor(data2) #Pseudocolor plot of 2D array
>>> axes2[0].pcolormesh(data) #Pseudocolor plot of 2D array
>>> CS = plt.contour(Y,X,U) #Plot contours
>>> axes2[2].contourf(data) #Plot filled contours
>>> plt.clabel(CS) #Label a contour plot

Vector Fields 

>>> axes[0,1].arrow(0,0,0.5,0.5) #Add an arrow to the axes
>>> axes[1,1].quiver(y,z) #Plot a 2D field of arrows
>>> axes[0,1].streamplot(X,Y,U,V) #Plot streamlines of a 2D vector field

Data Distributions 

>>> ax1.hist(y) #Plot a histogram
>>> ax3.boxplot(y) #Make a box and whisker plot
>>> ax3.violinplot(z)  #Make a violin plot

Plot Anatomy & Workflow 

Plot Anatomy 

[Figure: anatomy of a plot, with the y-axis and x-axis labeled]

Workflow 

The basic steps to creating plots with matplotlib are:

1 Prepare Data
2 Create Plot
3 Plot
4 Customize Plot
5 Save Plot
6 Show Plot

>>> import matplotlib.pyplot as plt
>>> x = [1,2,3,4]  #Step 1
>>> y = [10,20,25,30] 
>>> fig = plt.figure() #Step 2
>>> ax = fig.add_subplot(111) #Step 3
>>> ax.plot(x, y, color= 'lightblue', linewidth=3)  #Step 3, 4
>>> ax.scatter([2,4,6],
          [5,15,25],
          color= 'darkgreen',
          marker= '^' )
>>> ax.set_xlim(1, 6.5)
>>> plt.savefig('foo.png' ) #Step 5
>>> plt.show() #Step 6

Close and Clear 

>>> plt.cla()  #Clear an axis
>>> plt.clf() #Clear the entire figure
>>> plt.close() #Close a window

Customize Plot 

Colors, Color Bars & Color Maps 

>>> plt.plot(x, x, x, x**2, x, x** 3)
>>> ax.plot(x, y, alpha = 0.4)
>>> ax.plot(x, y, c= 'k')
>>> fig.colorbar(im, orientation= 'horizontal')
>>> im = ax.imshow(img,
            cmap= 'seismic' )

Markers 

>>> fig, ax = plt.subplots()
>>> ax.scatter(x,y,marker= ".")
>>> ax.plot(x,y,marker= "o")

Linestyles 

>>> plt.plot(x,y,linewidth=4.0)
>>> plt.plot(x,y,ls= 'solid') 
>>> plt.plot(x,y,ls= '--') 
>>> plt.plot(x,y,'--' ,x**2,y**2,'-.' ) 
>>> plt.setp(lines,color= 'r',linewidth=4.0)

Text & Annotations 

>>> ax.text(1,
           -2.1, 
           'Example Graph', 
            style= 'italic' )
>>> ax.annotate("Sine", 
            xy=(8, 0),
            xycoords= 'data', 
            xytext=(10.5, 0),
            textcoords= 'data', 
            arrowprops=dict(arrowstyle= "->", 
                            connectionstyle="arc3"),)

Mathtext 

>>> plt.title(r'$\sigma_i=15$', fontsize=20)

Limits, Legends and Layouts 

Limits & Autoscaling 

>>> ax.margins(x=0.0,y=0.1) #Add padding to a plot
>>> ax.axis('equal')  #Set the aspect ratio of the plot to 1
>>> ax.set(xlim=[0,10.5],ylim=[-1.5,1.5])  #Set limits for x-and y-axis
>>> ax.set_xlim(0,10.5) #Set limits for x-axis

Legends 

>>> ax.set(title= 'An Example Axes',  #Set a title and x-and y-axis labels
            ylabel= 'Y-Axis', 
            xlabel= 'X-Axis')
>>> ax.legend(loc= 'best')  #No overlapping plot elements

Ticks 

>>> ax.xaxis.set(ticks=range(1,5),  #Manually set x-ticks
             ticklabels=[3,100, 12,"foo" ])
>>> ax.tick_params(axis= 'y', #Make y-ticks longer and go in and out
             direction= 'inout', 
              length=10)

Subplot Spacing 

>>> fig3.subplots_adjust(wspace=0.5,   #Adjust the spacing between subplots
             hspace=0.3,
             left=0.125,
             right=0.9,
             top=0.9,
             bottom=0.1)
>>> fig.tight_layout() #Fit subplot(s) in to the figure area

Axis Spines 

>>> ax1.spines[ 'top'].set_visible(False) #Make the top axis line for a plot invisible
>>> ax1.spines['bottom' ].set_position(( 'outward',10))  #Move the bottom axis line outward


Original article source at https://www.datacamp.com

#matplotlib #cheatsheet #python

 iOS App Dev


1620466520

Your Data Architecture: Simple Best Practices for Your Data Strategy


If you accumulate data on which you base your decision-making as an organization, you most probably need to think about your data architecture and consider possible best practices. Gaining a competitive edge, remaining customer-centric to the greatest extent possible, and streamlining processes to get on-the-button outcomes can all be traced back to an organization’s capacity to build a future-ready data architecture.

In what follows, we offer a short overview of the overarching capabilities of data architecture. These include user-centricity, elasticity, robustness, and the capacity to ensure the seamless flow of data at all times. Added to these are automation enablement, plus security and data governance considerations. These points form our checklist for what we perceive to be an anticipatory analytics ecosystem.

#big data #data science #big data analytics #data analysis #data architecture #data transformation #data platform #data strategy #cloud data platform #data acquisition

Gerhard  Brink


1620629020

Getting Started With Data Lakes

Frameworks for Efficient Enterprise Analytics

The opportunities big data offers also come with very real challenges that many organizations are facing today. Often, it’s finding the most cost-effective, scalable way to store and process boundless volumes of data in multiple formats that come from a growing number of sources. Then organizations need the analytical capabilities and flexibility to turn this data into insights that can meet their specific business objectives.

This Refcard dives into how a data lake helps tackle these challenges at both ends — from its enhanced architecture that’s designed for efficient data ingestion, storage, and management to its advanced analytics functionality and performance flexibility. You’ll also explore key benefits and common use cases.

Introduction

As technology continues to evolve with new data sources, such as IoT sensors and social media churning out large volumes of data, there has never been a better time to discuss the possibilities and challenges of managing such data for varying analytical insights. In this Refcard, we dig deep into how data lakes solve the problem of storing and processing enormous amounts of data. While doing so, we also explore the benefits of data lakes, their use cases, and how they differ from data warehouses (DWHs).


This is a preview of the Getting Started With Data Lakes Refcard. To read the entire Refcard, please download the PDF from the link above.

#big data #data analytics #data analysis #business analytics #data warehouse #data storage #data lake #data lake architecture #data lake governance #data lake management

Python String Methods Explained with Examples

Python has a set of built-in methods that you can use on strings.

Note: All string methods return new values. They do not change the original string.
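
A short sketch of what this note means in practice is shown below. The variable names are arbitrary.

```python
original = "hello, world"

# Each method call builds and returns a new string.
shouted = original.upper()
replaced = original.replace("world", "Python")

# The original string is left untouched.
print(original)
print(shouted)
print(replaced)
```

To keep a result, assign it to a name; calling a method without using its return value has no lasting effect.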

Method           Description
capitalize()     Converts the first character to upper case
casefold()       Converts string into lower case
center()         Returns a centered string
count()          Returns the number of times a specified value occurs in a string
encode()         Returns an encoded version of the string
endswith()       Returns true if the string ends with the specified value
expandtabs()     Sets the tab size of the string
find()           Searches the string for a specified value and returns the position of where it was found
format()         Formats specified values in a string
format_map()     Formats specified values in a string
index()          Searches the string for a specified value and returns the position of where it was found
isalnum()        Returns True if all characters in the string are alphanumeric
isalpha()        Returns True if all characters in the string are in the alphabet
isascii()        Returns True if all characters in the string are ascii characters
isdecimal()      Returns True if all characters in the string are decimals
isdigit()        Returns True if all characters in the string are digits
isidentifier()   Returns True if the string is an identifier
islower()        Returns True if all characters in the string are lower case
isnumeric()      Returns True if all characters in the string are numeric
isprintable()    Returns True if all characters in the string are printable
isspace()        Returns True if all characters in the string are whitespaces
istitle()        Returns True if the string follows the rules of a title
isupper()        Returns True if all characters in the string are upper case
join()           Converts the elements of an iterable into a string
ljust()          Returns a left justified version of the string
lower()          Converts a string into lower case
lstrip()         Returns a left trim version of the string
maketrans()      Returns a translation table to be used in translations
partition()      Returns a tuple where the string is parted into three parts
replace()        Returns a string where a specified value is replaced with a specified value
rfind()          Searches the string for a specified value and returns the last position of where it was found
rindex()         Searches the string for a specified value and returns the last position of where it was found
rjust()          Returns a right justified version of the string
rpartition()     Returns a tuple where the string is parted into three parts
rsplit()         Splits the string at the specified separator, and returns a list
rstrip()         Returns a right trim version of the string
split()          Splits the string at the specified separator, and returns a list
splitlines()     Splits the string at line breaks and returns a list
startswith()     Returns true if the string starts with the specified value
strip()          Returns a trimmed version of the string
swapcase()       Swaps cases, lower case becomes upper case and vice versa
title()          Converts the first character of each word to upper case
translate()      Returns a translated string
upper()          Converts a string into upper case
zfill()          Fills the string with a specified number of 0 values at the beginning

 


Python String capitalize() Method

Example

Upper case the first letter in this sentence:

txt = "hello, and welcome to my world."

x = txt.capitalize()

print (x)

Definition and Usage

The capitalize() method returns a string where the first character is upper case, and the rest is lower case.

Syntax

string.capitalize()

Parameter Values

No parameters

More Examples

Example

The first character is converted to upper case, and the rest are converted to lower case:

txt = "python is FUN!"

x = txt.capitalize()

print (x)

Example

See what happens if the first character is a number:

txt = "36 is my age."

x = txt.capitalize()

print (x)

Python String casefold() Method

Example

Make the string lower case:

txt = "Hello, And Welcome To My World!"

x = txt.casefold()

print(x)

Definition and Usage

The casefold() method returns a string where all the characters are lower case.

This method is similar to the lower() method, but casefold() is more aggressive: it converts more characters into lower case, so it finds more matches when two strings are compared after both have been converted with casefold().

Syntax

string.casefold()

Parameter Values

No parameters
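
The difference from lower() only shows up with certain characters, such as the German "ß". The following is a minimal sketch of that behavior; the variable names are chosen for illustration only:

```python
# Compare lower() and casefold() on a string containing the German sharp s.
word = "Straße"

lowered = word.lower()     # "ß" is already lower case, so it is kept
folded = word.casefold()   # casefold() expands "ß" to "ss"

print(lowered)  # straße
print(folded)   # strasse

# For a caseless comparison, fold both sides before comparing.
same = "STRASSE".casefold() == word.casefold()
print(same)     # True
```

For plain ASCII text, lower() and casefold() give the same result.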


Python String center() Method

Example

Print the word "banana", taking up the space of 20 characters, with "banana" in the middle:

txt = "banana"

x = txt.center(20)

print(x)

Definition and Usage

The center() method will center align the string, using a specified character (space is default) as the fill character.

Syntax

string.center(length, character)

Parameter Values

ParameterDescription
lengthRequired. The length of the returned string
characterOptional. The character to fill the missing space on each side. Default is " " (space)

More Examples

Example

Using the letter "O" as the padding character:

txt = "banana"

x = txt.center(20, "O")

print(x)

Python String count() Method

Example

Return the number of times the value "apple" appears in the string:

txt = "I love apples, apple are my favorite fruit"

x = txt.count("apple")

print(x)

Definition and Usage

The count() method returns the number of times a specified value appears in the string.

Syntax

string.count(value, start, end)

Parameter Values

ParameterDescription
valueRequired. A String. The string to search for
startOptional. An Integer. The position to start the search. Default is 0
endOptional. An Integer. The position to end the search. Default is the end of the string

More Examples

Example

Search from position 10 to 24:

txt = "I love apples, apple are my favorite fruit"

x = txt.count("apple", 10, 24)

print(x)
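
Two details of count() are easy to miss: matches are counted without overlap, and start and end behave like slice bounds (the end position is not included). A small sketch with an illustrative string:

```python
txt = "banana"

# Occurrences are counted without overlap: after "ana" matches at index 1,
# the search resumes at index 4, where only "na" is left.
overlap = txt.count("ana")

# start and end work like slice bounds: only txt[2:6] ("nana") is searched.
bounded = txt.count("a", 2, 6)

print(overlap)  # 1
print(bounded)  # 2
```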

Python String encode() Method

Example

UTF-8 encode the string:

txt = "My name is Ståle"

x = txt.encode()

print(x)

Definition and Usage

The encode() method encodes the string, using the specified encoding. If no encoding is specified, UTF-8 will be used.

Syntax

string.encode(encoding=encoding, errors=errors)

Parameter Values

ParameterDescription
encodingOptional. A String specifying the encoding to use. Default is UTF-8
errorsOptional. A String specifying the error method. Legal values are:
'backslashreplace' - uses a backslash escape sequence instead of the character that could not be encoded
'ignore' - ignores the characters that cannot be encoded
'namereplace' - replaces the character with a \N{...} escape that names the character
'strict' - Default, raises an error on failure
'replace' - replaces the character with a question mark
'xmlcharrefreplace' - replaces the character with an XML character reference

More Examples

Example

These examples use ASCII encoding and a character that cannot be encoded, showing the result with different error methods:

txt = "My name is Ståle"

print(txt.encode(encoding="ascii",errors="backslashreplace"))
print(txt.encode(encoding="ascii",errors="ignore"))
print(txt.encode(encoding="ascii",errors="namereplace"))
print(txt.encode(encoding="ascii",errors="replace"))
print(txt.encode(encoding="ascii",errors="xmlcharrefreplace"))
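
encode() returns a bytes object, and bytes.decode() turns it back into a string. The sketch below shows the round trip and two of the error methods; the variable names are illustrative:

```python
txt = "My name is Ståle"

data = txt.encode()               # UTF-8 by default; returns a bytes object
restored = data.decode("utf-8")   # decode() reverses the encoding

replaced = txt.encode("ascii", errors="replace")
xml_ref = txt.encode("ascii", errors="xmlcharrefreplace")

print(data)      # b'My name is St\xc3\xa5le'
print(restored)  # My name is Ståle
print(replaced)  # b'My name is St?le'
print(xml_ref)   # b'My name is St&#229;le'
```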

Python String endswith() Method

Example

Check if the string ends with a punctuation sign (.):

txt = "Hello, welcome to my world."

x = txt.endswith(".")

print(x)

Definition and Usage

The endswith() method returns True if the string ends with the specified value, otherwise False.

Syntax

string.endswith(value, start, end)

Parameter Values

ParameterDescription
valueRequired. The value to check if the string ends with
startOptional. An Integer specifying at which position to start the search
endOptional. An Integer specifying at which position to end the search

More Examples

Example

Check if the string ends with the phrase "my world.":

txt = "Hello, welcome to my world."

x = txt.endswith("my world.")

print(x)

Example

Check if position 5 to 11 ends with the phrase "my world.":

txt = "Hello, welcome to my world."

x = txt.endswith("my world.", 5, 11)

print(x)
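
The value may also be a tuple of strings, in which case endswith() returns True if the string ends with any of them. A short sketch, using made-up file names:

```python
filenames = ["report.csv", "notes.txt", "data.tsv"]

# A tuple of suffixes matches if the string ends with any one of them.
tabular = [name for name in filenames if name.endswith((".csv", ".tsv"))]
print(tabular)   # ['report.csv', 'data.tsv']

# start and end restrict the check to the slice txt[start:end].
txt = "Hello, welcome to my world."
in_slice = txt.endswith("welcome", 0, 14)   # txt[0:14] is "Hello, welcome"
print(in_slice)  # True
```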

Python String expandtabs() Method

Example

Set the tab size to 2 whitespaces:

txt = "H\te\tl\tl\to"

x =  txt.expandtabs(2)

print(x)

Definition and Usage

The expandtabs() method sets the tab size to the specified number of whitespaces.

Syntax

string.expandtabs(tabsize)

Parameter Values

ParameterDescription
tabsizeOptional. A number specifying the tabsize. Default tabsize is 8

More Examples

Example

See the result using different tab sizes:

txt = "H\te\tl\tl\to"

print(txt)
print(txt.expandtabs())
print(txt.expandtabs(2))
print(txt.expandtabs(4))
print(txt.expandtabs(10))
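
A tab is not replaced by a fixed number of spaces. Each tab is padded up to the next column that is a multiple of the tab size, which is what makes the columns line up. A sketch with an illustrative row of text:

```python
row = "id\tname\tcity"

# "id" ends at column 2, so the first tab adds 4 spaces to reach column 6.
# "name" ends at column 10, so the second tab adds 2 spaces to reach column 12.
expanded = row.expandtabs(6)

print(expanded)       # id    name  city
print(len(expanded))  # 16
```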

Python String find() Method

Example

Where in the text is the word "welcome"?:

txt = "Hello, welcome to my world."

x = txt.find("welcome")

print(x)

Definition and Usage

The find() method finds the first occurrence of the specified value.

The find() method returns -1 if the value is not found.

The find() method is almost the same as the index() method. The only difference is that the index() method raises an exception if the value is not found. (See example below)

Syntax

string.find(value, start, end)

Parameter Values

ParameterDescription
valueRequired. The value to search for
startOptional. Where to start the search. Default is 0
endOptional. Where to end the search. Default is to the end of the string

More Examples

Example

Where in the text is the first occurrence of the letter "e"?:

txt = "Hello, welcome to my world."

x = txt.find("e")

print(x)

Example

Where in the text is the first occurrence of the letter "e" when you only search between position 5 and 10?:

txt = "Hello, welcome to my world."

x = txt.find("e", 5, 10)

print(x)

Example

If the value is not found, the find() method returns -1, but the index() method will raise an exception:

txt = "Hello, welcome to my world."

print(txt.find("q"))
print(txt.index("q"))
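
In practice the choice between the two methods decides how the "not found" case is handled. The sketch below shows both styles side by side; the flag name raised is only for illustration:

```python
txt = "Hello, welcome to my world."

# find() signals "not found" with -1, so test the result before using it.
pos = txt.find("q")
if pos == -1:
    print("find: not found")

# index() raises ValueError instead, so wrap the call in try/except.
try:
    txt.index("q")
    raised = False
except ValueError:
    raised = True
    print("index: not found")
```

If only a yes/no answer is needed, the in operator ("q" in txt) is usually simpler than either method.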

Python String format() Method

Example

Insert the price inside the placeholder, the price should be in fixed point, two-decimal format:

txt = "For only {price:.2f} dollars!"
print(txt.format(price = 49))

Definition and Usage

The format() method formats the specified value(s) and inserts them inside the string's placeholder.

The placeholder is defined using curly brackets: {}. Read more about the placeholders in the Placeholder section below.

The format() method returns the formatted string.

Syntax

string.format(value1, value2...)

Parameter Values

ParameterDescription
value1, value2...Required. One or more values that should be formatted and inserted in the string.

The values are either a list of values separated by commas, a key=value list, or a combination of both.

The values can be of any data type.

The Placeholders

The placeholders can be identified using named indexes {price}, numbered indexes {0}, or even empty placeholders {}.

Example

Using different placeholder values:

txt1 = "My name is {fname}, I'm {age}".format(fname = "John", age = 36)
txt2 = "My name is {0}, I'm {1}".format("John",36)
txt3 = "My name is {}, I'm {}".format("John",36)

Formatting Types

Inside the placeholders you can add a formatting type to format the result:

:< Left aligns the result (within the available space)
:> Right aligns the result (within the available space)
:^ Center aligns the result (within the available space)
:= Places the sign at the leftmost position
:+ Use a plus sign to indicate if the result is positive or negative
:- Use a minus sign for negative values only
: (space) Use a space to insert an extra space before positive numbers (and a minus sign before negative numbers)
:, Use a comma as a thousand separator
:_ Use an underscore as a thousand separator
:b Binary format
:c Converts the value into the corresponding unicode character
:d Decimal format
:e Scientific format, with a lower case e
:E Scientific format, with an upper case E
:f Fixed point number format
:F Fixed point number format, in uppercase format (show inf and nan as INF and NAN)
:g General format
:G General format (using an upper case E for scientific notations)
:o Octal format
:x Hex format, lower case
:X Hex format, upper case
:n Number format
:% Percentage format
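
The formatting types can be tried out directly. The following sketch collects a few of them in a dictionary; the dictionary keys are labels made up for this example, and the "|" character is only there to make the padding visible:

```python
# Each entry pairs a label with the result of one formatting type.
examples = {
    "left":       "{:<6}|".format("ab"),       # 'ab    |'
    "right":      "{:>6}|".format("ab"),       # '    ab|'
    "center":     "{:^6}|".format("ab"),       # '  ab  |'
    "sign":       "{:+d}".format(5),           # '+5'
    "sign_left":  "{:=+6d}".format(5),         # '+    5'
    "space":      "{: d}".format(5),           # ' 5'
    "comma":      "{:,}".format(1234567),      # '1,234,567'
    "underscore": "{:_}".format(1234567),      # '1_234_567'
    "binary":     "{:b}".format(5),            # '101'
    "char":       "{:c}".format(65),           # 'A'
    "octal":      "{:o}".format(8),            # '10'
    "hex_lower":  "{:x}".format(255),          # 'ff'
    "hex_upper":  "{:X}".format(255),          # 'FF'
    "scientific": "{:e}".format(1234.5),       # '1.234500e+03'
    "fixed":      "{:.2f}".format(49),         # '49.00'
    "percent":    "{:.1%}".format(0.256),      # '25.6%'
}

for label, text in examples.items():
    print(label, repr(text))
```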

Python String index() Method

Example

Where in the text is the word "welcome"?:

txt = "Hello, welcome to my world."

x = txt.index("welcome")

print(x)

Definition and Usage

The index() method finds the first occurrence of the specified value.

The index() method raises an exception if the value is not found.

The index() method is almost the same as the find() method. The only difference is that the find() method returns -1 if the value is not found. (See example below)

Syntax

string.index(value, start, end)

Parameter Values

ParameterDescription
valueRequired. The value to search for
startOptional. Where to start the search. Default is 0
endOptional. Where to end the search. Default is to the end of the string

More Examples

Example

Where in the text is the first occurrence of the letter "e"?:

txt = "Hello, welcome to my world."

x = txt.index("e")

print(x)

Example

Where in the text is the first occurrence of the letter "e" when you only search between position 5 and 10?:

txt = "Hello, welcome to my world."

x = txt.index("e", 5, 10)

print(x)

Example

If the value is not found, the find() method returns -1, but the index() method will raise an exception:

txt = "Hello, welcome to my world."

print(txt.find("q"))
print(txt.index("q"))

Python String isalnum() Method

Example

Check if all the characters in the text are alphanumeric:

txt = "Company12"

x = txt.isalnum()

print(x)

Definition and Usage

The isalnum() method returns True if all the characters are alphanumeric, meaning alphabet letters (a-z) and numbers (0-9).

Example of characters that are not alphanumeric: (space)!#%&? etc.

Syntax

string.isalnum()

Parameter Values

No parameters.

More Examples

Example

Check if all the characters in the text are alphanumeric:

txt = "Company 12"

x = txt.isalnum()

print(x)

Python String isalpha() Method

Example

Check if all the characters in the text are letters:

txt = "CompanyX"

x = txt.isalpha()

print(x)

Definition and Usage

The isalpha() method returns True if all the characters are alphabet letters (a-z). Letters from other alphabets, such as å, also count.

Example of characters that are not alphabet letters: (space)!#%&? etc.

Syntax

string.isalpha()

Parameter Values

No parameters.

More Examples

Example

Check if all the characters in the text are alphabetic:

txt = "Company10"

x = txt.isalpha()

print(x)
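
Both isalnum() and isalpha() work on Unicode characters, not only on a-z, and both return False for an empty string. A brief sketch with illustrative variable names:

```python
accented = "Ståle".isalpha()      # "å" is a letter
japanese = "日本語".isalpha()      # letters from other scripts count too
mixed = "abc123".isalnum()        # letters and digits only
underscore = "abc_123".isalnum()  # "_" is neither a letter nor a number
empty = "".isalpha()              # an empty string has no characters to check

print(accented, japanese, mixed, underscore, empty)
# True True True False False
```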

Python String isascii() Method

Example

Check if all the characters in the text are ascii characters:

txt = "Company123"

x = txt.isascii()

print(x)

Definition and Usage

The isascii() method returns True if all the characters are ASCII characters (code points 0 to 127). An empty string also returns True.

Syntax

string.isascii()

Parameter Values

No parameters.
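
ASCII covers more than letters: digits, punctuation and control characters such as the line feed are ASCII too. A small sketch with illustrative variable names:

```python
plain = "Company123".isascii()   # letters and digits are ASCII
control = "a\tb\n".isascii()     # tab and line feed are ASCII as well
accented = "Ståle".isascii()     # "å" is outside the ASCII range
empty = "".isascii()             # an empty string counts as ASCII

print(plain, control, accented, empty)
# True True False True
```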


Python String isdecimal() Method

Example

Check if all the characters in the unicode object are decimals:

txt = "\u0033" #unicode for 3

x = txt.isdecimal()

print(x)

Definition and Usage

The isdecimal() method returns True if all the characters are decimals (0-9).

In Python 3 every string is a Unicode string, so this method can be used on any string.

Syntax

string.isdecimal()

Parameter Values

No parameters.

More Examples

Example

Check if all the characters in the unicode are decimals:

a = "\u0030" #unicode for 0
b = "\u0047" #unicode for G

print(a.isdecimal())
print(b.isdecimal())

Python String isdigit() Method

Example

Check if all the characters in the text are digits:

txt = "50800"

x = txt.isdigit()

print(x)

Definition and Usage

The isdigit() method returns True if all the characters are digits, otherwise False.

Superscript digits, like ², are also considered to be digits.

Syntax

string.isdigit()

Parameter Values

No parameters.

More Examples

Example

Check if all the characters in the text are digits:

a = "\u0030" #unicode for 0
b = "\u00B2" #unicode for ²

print(a.isdigit())
print(b.isdigit())

Python String isidentifier() Method

Example

Check if the string is a valid identifier:

txt = "Demo"

x = txt.isidentifier()

print(x)

Definition and Usage

The isidentifier() method returns True if the string is a valid identifier, otherwise False.

A string is considered a valid identifier if it only contains alphanumeric characters (a-z and 0-9) or underscores (_). A valid identifier cannot start with a number or contain any spaces.

Syntax

string.isidentifier()

Parameter Values

No parameters.

More Examples

Example

Check if the strings are valid identifiers:

a = "MyFolder"
b = "Demo002"
c = "2bring"
d = "my demo"

print(a.isidentifier())
print(b.isidentifier())
print(c.isidentifier())
print(d.isidentifier())
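
Note that isidentifier() also returns True for reserved words such as "class", which cannot be used as names. One way to handle this is to combine it with keyword.iskeyword() from the standard library. The helper name is_usable_name below is hypothetical, chosen only for this sketch:

```python
import keyword


def is_usable_name(name):
    """Return True if name is a valid identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


print("class".isidentifier())    # True, although "class" is a keyword
print(is_usable_name("class"))   # False
print(is_usable_name("my_var"))  # True
print(is_usable_name("2bring"))  # False
```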

Python String islower() Method

Example

Check if all the characters in the text are in lower case:

txt = "hello world!"

x = txt.islower()

print(x)

Definition and Usage

The islower() method returns True if all the characters are in lower case, otherwise False.

Numbers, symbols and spaces are not checked, only alphabet characters.

Syntax

string.islower()

Parameter Values

No parameters.

More Examples

Example

Check if all the characters in the texts are in lower case:

a = "Hello world!"
b = "hello 123"
c = "mynameisPeter"

print(a.islower())
print(b.islower())
print(c.islower())

Python String isnumeric() Method

Example

Check if all the characters in the text are numeric:

txt = "565543"

x = txt.isnumeric()

print(x)

Definition and Usage

The isnumeric() method returns True if all the characters are numeric (0-9), otherwise False.

Superscripts and fractions, like ² and ¾, are also considered to be numeric values.

"-1" and "1.5" are NOT considered numeric values, because all the characters in the string must be numeric, and the - and the . are not.

Syntax

string.isnumeric()

Parameter Values

No parameters.

More Examples

Example

Check if the characters are numeric:

a = "\u0030" #unicode for 0
b = "\u00B2" #unicode for ²
c = "10km2"
d = "-1"
e = "1.5"

print(a.isnumeric())
print(b.isnumeric())
print(c.isnumeric())
print(d.isnumeric())
print(e.isnumeric())
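
The isdecimal(), isdigit() and isnumeric() methods are easy to confuse, because each one accepts a wider set of characters than the one before. The sketch below runs all three on the same samples; the dictionary name results is illustrative:

```python
# 5, superscript two, the fraction three quarters, and a decimal number
samples = ["5", "\u00B2", "\u00BE", "1.5"]

# Map each sample to (isdecimal, isdigit, isnumeric).
results = {s: (s.isdecimal(), s.isdigit(), s.isnumeric()) for s in samples}

for sample, checks in results.items():
    print(repr(sample), checks)
# '5' (True, True, True)
# '²' (False, True, True)
# '¾' (False, False, True)
# '1.5' (False, False, False)
```

To test whether a string such as "-1" or "1.5" is a valid number, calling int() or float() inside a try/except block is usually the more reliable approach.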

Python String isprintable() Method

Example

Check if all the characters in the text are printable:

txt = "Hello! Are you #1?"

x = txt.isprintable()

print(x)

Definition and Usage

The isprintable() method returns True if all the characters are printable, otherwise False.

Examples of non-printable characters are carriage return and line feed.

Syntax

string.isprintable()

Parameter Values

No parameters.

More Examples

Example

Check if all the characters in the text are printable:

txt = "Hello!\nAre you #1?"

x = txt.isprintable()

print(x)

Python String isspace() Method

Example

Check if all the characters in the text are whitespaces:

txt = "   "

x = txt.isspace()

print(x)

Definition and Usage

The isspace() method returns True if all the characters in a string are whitespaces, otherwise False.

Syntax

string.isspace()

Parameter Values

No parameters.

More Examples

Example

Check if all the characters in the text are whitespaces:

txt = "   s   "

x = txt.isspace()

print(x)

Python String istitle() Method

Example

Check if each word starts with an upper case letter:

txt = "Hello, And Welcome To My World!"

x = txt.istitle()

print(x)

Definition and Usage

The istitle() method returns True if all words in a text start with an upper case letter, AND the rest of each word is in lower case letters, otherwise False.

Symbols and numbers are ignored.

Syntax

string.istitle()

Parameter Values

No parameters.

More Examples

Example

Check if each word starts with an upper case letter:

a = "HELLO, AND WELCOME TO MY WORLD"
b = "Hello"
c = "22 Names"
d = "This Is %'!?"

print(a.istitle())
print(b.istitle())
print(c.istitle())
print(d.istitle())

Python String isupper() Method

Example

Check if all the characters in the text are in upper case:

txt = "THIS IS NOW!"

x = txt.isupper()

print(x)

Definition and Usage

The isupper() method returns True if all the characters are in upper case, otherwise False.

Numbers, symbols and spaces are not checked, only alphabet characters.

Syntax

string.isupper()

Parameter Values

No parameters.

More Examples

Example

Check if all the characters in the texts are in upper case:

a = "Hello World!"
b = "hello 123"
c = "MY NAME IS PETER"

print(a.isupper())
print(b.isupper())
print(c.isupper())
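
Because only alphabet characters are checked, a string without any letters is neither upper case nor lower case: both isupper() and islower() return False for it. A short sketch with illustrative variable names:

```python
digits_upper = "123".isupper()   # no letters at all, so not upper case
digits_lower = "123".islower()   # and not lower case either
with_digits = "ABC 123".isupper()  # digits and spaces are skipped

print(digits_upper, digits_lower, with_digits)
# False False True
```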

Python String join() Method

Example

Join all items in a tuple into a string, using a hash character as separator:

myTuple = ("John", "Peter", "Vicky")

x = "#".join(myTuple)

print(x)

Definition and Usage

The join() method takes all items in an iterable and joins them into one string.

A string must be specified as the separator.

Syntax

string.join(iterable)

Parameter Values

ParameterDescription
iterableRequired. Any iterable object where all the returned values are strings

More Examples

Example

Join all items in a dictionary into a string, using the word "TEST" as separator:

myDict = {"name": "John", "country": "Norway"}
mySeparator = "TEST"

x = mySeparator.join(myDict)

print(x)
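
Two points follow from the examples above: joining a dictionary joins its keys, and every item must already be a string, otherwise join() raises a TypeError. A sketch of both, with illustrative variable names:

```python
myDict = {"name": "John", "country": "Norway"}

keys = "TEST".join(myDict)             # iterating a dictionary yields its keys
values = "TEST".join(myDict.values())  # join the values explicitly

# Numbers must be converted to strings before joining.
numbers = [1, 2, 3]
joined = ", ".join(str(n) for n in numbers)

print(keys)    # nameTESTcountry
print(values)  # JohnTESTNorway
print(joined)  # 1, 2, 3
```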

Python String ljust() Method

Example

Return a 20 characters long, left justified version of the word "banana":

txt = "banana"

x = txt.ljust(20)

print(x, "is my favorite fruit.")

Note: In the result, there are actually 14 whitespaces to the right of the word banana.

Definition and Usage

The ljust() method will left align the string, using a specified character (space is default) as the fill character.

Syntax

string.ljust(length, character)

Parameter Values

ParameterDescription
lengthRequired. The length of the returned string
characterOptional. A character to fill the missing space (to the right of the string). Default is " " (space).

More Examples

Example

Using the letter "O" as the padding character:

txt = "banana"

x = txt.ljust(20, "O")

print(x)

Python String lower() Method

Example

Lower case the string:

txt = "Hello my FRIENDS"

x = txt.lower()

print(x)

Definition and Usage

The lower() method returns a string where all characters are lower case.

Symbols and numbers are ignored.

Syntax

string.lower()

Parameter Values

No parameters


Python String lstrip() Method

Example

Remove spaces to the left of the string:

txt = "     banana     "

x = txt.lstrip()

print("of all fruits", x, "is my favorite")

Definition and Usage

The lstrip() method removes any leading characters (whitespace is the default leading character to remove).

Syntax

string.lstrip(characters)

Parameter Values

ParameterDescription
charactersOptional. A set of characters to remove as leading characters

More Examples

Example

Remove the leading characters:

txt = ",,,,,ssaaww.....banana"

x = txt.lstrip(",.asw")

print(x)
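
The characters argument is treated as a set of single characters, not as a prefix, which can remove more than intended. The sketch below shows the effect and one possible way to remove an exact prefix instead; the helper remove_prefix is hypothetical and written only for this example:

```python
txt = "www.wikipedia.org"

# Every leading "w" or "." is removed, including the "w" of "wikipedia".
stripped = txt.lstrip("w.")
print(stripped)  # ikipedia.org


def remove_prefix(text, prefix):
    """Remove prefix from the start of text only if it is really there."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


without_www = remove_prefix(txt, "www.")
print(without_www)  # wikipedia.org
```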

Python String maketrans() Method

Example

Create a mapping table, and use it in the translate() method to replace any "S" characters with a "P" character:

txt = "Hello Sam!"
mytable = txt.maketrans("S", "P")
print(txt.translate(mytable))

Definition and Usage

The maketrans() method returns a mapping table that can be used with the translate() method to replace specified characters.

Syntax

string.maketrans(x, y, z)

Parameter Values

ParameterDescription
xRequired. If only one parameter is specified, this has to be a dictionary describing how to perform the replace. If two or more parameters are specified, this parameter has to be a string specifying the characters you want to replace.
yOptional. A string with the same length as parameter x. Each character in the first parameter will be replaced with the corresponding character in this string.
zOptional. A string describing which characters to remove from the original string.

More Examples

Example

Use a mapping table to replace many characters:

txt = "Hi Sam!"
x = "mSa"
y = "eJo"
mytable = txt.maketrans(x, y)
print(txt.translate(mytable))

Example

The third parameter in the mapping table describes characters that you want to remove from the string:

txt = "Good night Sam!"
x = "mSa"
y = "eJo"
z = "odnght"
mytable = txt.maketrans(x, y, z)
print(txt.translate(mytable))

Example

The maketrans() method itself returns a dictionary describing each replacement, in unicode:

txt = "Good night Sam!"
x = "mSa"
y = "eJo"
z = "odnght"
print(txt.maketrans(x, y, z))
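
maketrans() is a static method, so it can also be called as str.maketrans() without an existing string. The sketch below shows the three ways of building a table and what each table looks like; the variable names are illustrative:

```python
from_strings = str.maketrans("S", "P")    # two strings of equal length
from_dict = str.maketrans({"S": "P"})     # one dictionary
with_delete = str.maketrans("", "", "!")  # third argument: characters to remove

print(from_strings)  # {83: 80}
print(from_dict)     # {83: 'P'}
print(with_delete)   # {33: None}

cleaned = "Hello Sam!".translate(with_delete)
print(cleaned)       # Hello Sam
```

Keys are always Unicode code points. A value of None tells translate() to delete the character.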

Python String partition() Method

Example

Search for the word "bananas", and return a tuple with three elements:

1 - everything before the "match"
2 - the "match"
3 - everything after the "match"

txt = "I could eat bananas all day"

x = txt.partition("bananas")

print(x)

Definition and Usage

The partition() method searches for a specified string, and splits the string into a tuple containing three elements.

The first element contains the part before the specified string.

The second element contains the specified string.

The third element contains the part after the string.

Note: This method searches for the first occurrence of the specified string.

Syntax

string.partition(value)

Parameter Values

ParameterDescription
valueRequired. The string to search for

More Examples

Example

If the specified value is not found, the partition() method returns a tuple containing: 1 - the whole string, 2 - an empty string, 3 - an empty string:

txt = "I could eat bananas all day"

x = txt.partition("apples")

print(x)
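
Because partition() always returns exactly three elements, the result can be unpacked directly, and the second element shows whether the separator was found. A sketch with an illustrative "key=value" line:

```python
line = "timeout=30=seconds"

# Only the first "=" is used as the split point.
key, sep, value = line.partition("=")
print(key)    # timeout
print(value)  # 30=seconds

# If the separator is missing, sep is an empty string.
missing = "no separator here".partition("=")
print(missing)  # ('no separator here', '', '')
```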

Python String replace() Method

Example

Replace the word "bananas":

txt = "I like bananas"

x = txt.replace("bananas", "apples")

print(x)

Definition and Usage

The replace() method replaces a specified phrase with another specified phrase.

Note: All occurrences of the specified phrase will be replaced, if nothing else is specified.

Syntax

string.replace(oldvalue, newvalue, count)

Parameter Values

ParameterDescription
oldvalueRequired. The string to search for
newvalueRequired. The string to replace the old value with
countOptional. A number specifying how many occurrences of the old value you want to replace. Default is all occurrences

More Examples

Example

Replace all occurrences of the word "one":

txt = "one one was a race horse, two two was one too."

x = txt.replace("one", "three")

print(x)

Example

Replace the first two occurrences of the word "one":

txt = "one one was a race horse, two two was one too."

x = txt.replace("one", "three", 2)

print(x)
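
Strings are immutable, so replace() never changes the original string. It returns a new one, and the result has to be assigned to a variable to be kept. A short sketch:

```python
txt = "one one was a race horse"

result = txt.replace("one", "three", 1)  # replace only the first occurrence

print(result)  # three one was a race horse
print(txt)     # one one was a race horse (unchanged)
```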

Python String rfind() Method

Example

Where in the text is the last occurrence of the string "casa"?:

txt = "Mi casa, su casa."

x = txt.rfind("casa")

print(x)

Definition and Usage

The rfind() method finds the last occurrence of the specified value.

The rfind() method returns -1 if the value is not found.

The rfind() method is almost the same as the rindex() method. See example below.

Syntax

string.rfind(value, start, end)

Parameter Values

ParameterDescription
valueRequired. The value to search for
startOptional. Where to start the search. Default is 0
endOptional. Where to end the search. Default is to the end of the string

More Examples

Example

Where in the text is the last occurrence of the letter "e"?:

txt = "Hello, welcome to my world."

x = txt.rfind("e")

print(x)

Example

Where in the text is the last occurrence of the letter "e" when you only search between position 5 and 10?:

txt = "Hello, welcome to my world."

x = txt.rfind("e", 5, 10)

print(x)

Example

If the value is not found, the rfind() method returns -1, but the rindex() method will raise an exception:

txt = "Hello, welcome to my world."

print(txt.rfind("q"))
print(txt.rindex("q"))
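
The position returned by rfind() always refers to the whole string, also when start and end are given. A common use is to cut off the part after the last separator. The sketch below uses a made-up file name:

```python
txt = "Hello, welcome to my world."

# Only txt[5:10] is searched, but the result is an index into txt itself.
limited = txt.rfind("e", 5, 10)
print(limited)  # 8

path = "archive.tar.gz"
dot = path.rfind(".")                        # position of the last "."
extension = path[dot:] if dot != -1 else ""  # guard against "not found"

print(dot)        # 11
print(extension)  # .gz
```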

Python String rindex() Method

Example

Where in the text is the last occurrence of the string "casa"?:

txt = "Mi casa, su casa."

x = txt.rindex("casa")

print(x)

Definition and Usage

The rindex() method finds the last occurrence of the specified value.

The rindex() method raises an exception if the value is not found.

The rindex() method is almost the same as the rfind() method. See example below.

Syntax

string.rindex(value, start, end)

Parameter Values

ParameterDescription
valueRequired. The value to search for
startOptional. Where to start the search. Default is 0
endOptional. Where to end the search. Default is to the end of the string

More Examples

Example

Where in the text is the last occurrence of the letter "e"?:

txt = "Hello, welcome to my world."

x = txt.rindex("e")

print(x)

Example

Where in the text is the last occurrence of the letter "e" when you only search between position 5 and 10?:

txt = "Hello, welcome to my world."

x = txt.rindex("e", 5, 10)

print(x)

Example

If the value is not found, the rfind() method returns -1, but the rindex() method will raise an exception:

txt = "Hello, welcome to my world."

print(txt.rfind("q"))
print(txt.rindex("q"))

Python String rjust() Method

Example

Return a 20 characters long, right justified version of the word "banana":

txt = "banana"

x = txt.rjust(20)

print(x, "is my favorite fruit.")

Note: In the result, there are actually 14 whitespaces to the left of the word banana.

Definition and Usage

The rjust() method will right align the string, using a specified character (space is default) as the fill character.

Syntax

string.rjust(length, character)

Parameter Values

ParameterDescription
lengthRequired. The length of the returned string
characterOptional. A character to fill the missing space (to the left of the string). Default is " " (space).

More Examples

Example

Using the letter "O" as the padding character:

txt = "banana"

x = txt.rjust(20, "O")

print(x)

Python String rpartition() Method

Example

Search for the last occurrence of the word "bananas", and return a tuple with three elements:

1 - everything before the "match"
2 - the "match"
3 - everything after the "match"

txt = "I could eat bananas all day, bananas are my favorite fruit"

x = txt.rpartition("bananas")

print(x)

Definition and Usage

The rpartition() method searches for the last occurrence of a specified string, and splits the string into a tuple containing three elements.

The first element contains the part before the specified string.

The second element contains the specified string.

The third element contains the part after the string.

Syntax

string.rpartition(value)

Parameter Values

ParameterDescription
valueRequired. The string to search for

More Examples

Example

If the specified value is not found, the rpartition() method returns a tuple containing: 1 - an empty string, 2 - an empty string, 3 - the whole string:

txt = "I could eat bananas all day, bananas are my favorite fruit"

x = txt.rpartition("apples")

print(x)

Python String rsplit() Method

Example

Split a string into a list, using comma, followed by a space (, ) as the separator:

txt = "apple, banana, cherry"

x = txt.rsplit(", ")

print(x)

Definition and Usage

The rsplit() method splits a string into a list, starting from the right.

If maxsplit is not specified, this method returns the same result as the split() method.

Note: When maxsplit is specified, the list will contain at most the specified number of elements plus one.

Syntax

string.rsplit(separator, maxsplit)

Parameter Values

ParameterDescription
separatorOptional. Specifies the separator to use when splitting the string. By default any whitespace is a separator
maxsplitOptional. Specifies how many splits to do. Default value is -1, which is "all occurrences"

More Examples

Example

Split the string into a list with maximum 2 items:

txt = "apple, banana, cherry"

# setting the maxsplit parameter to 1, will return a list with 2 elements!
x = txt.rsplit(", ", 1)

print(x)
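
The difference between rsplit() and split() only becomes visible when maxsplit limits the number of splits. A sketch with an illustrative path:

```python
path = "usr/local/bin/python"

# rsplit() makes its one split at the last "/".
head, tail = path.rsplit("/", 1)
print(head)  # usr/local/bin
print(tail)  # python

# split() makes its one split at the first "/".
first, rest = path.split("/", 1)
print(first)  # usr
print(rest)   # local/bin/python
```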

Python String rstrip() Method

Example

Remove any white spaces at the end of the string:

txt = "     banana     "

x = txt.rstrip()

print("of all fruits", x, "is my favorite")

Definition and Usage

The rstrip() method removes any trailing characters (characters at the end of a string). Whitespace is the default trailing character to remove.

Syntax

string.rstrip(characters)

Parameter Values

ParameterDescription
charactersOptional. A set of characters to remove as trailing characters

More Examples

Example

Remove the trailing characters if they are commas, periods, s, q, or w:

txt = "banana,,,,,ssqqqww....."

x = txt.rstrip(",.qsw")

print(x)

Python String split() Method

Example

Split a string into a list where each word is a list item:

txt = "welcome to the jungle"

x = txt.split()

print(x)

Definition and Usage

The split() method splits a string into a list.

You can specify the separator, default separator is any whitespace.

Note: When maxsplit is specified, the list will contain at most the specified number of elements plus one.

Syntax

string.split(separator, maxsplit)

Parameter Values

ParameterDescription
separatorOptional. Specifies the separator to use when splitting the string. By default any whitespace is a separator
maxsplitOptional. Specifies how many splits to do. Default value is -1, which is "all occurrences"

More Examples

Example

Split the string, using comma, followed by a space, as a separator:

txt = "hello, my name is Peter, I am 26 years old"

x = txt.split(", ")

print(x)

Example

Use a hash character as a separator:

txt = "apple#banana#cherry#orange"

x = txt.split("#")

print(x)

Example

Split the string into a list with max 2 items:

txt = "apple#banana#cherry#orange"

# setting the maxsplit parameter to 1, will return a list with 2 elements!
x = txt.split("#", 1)

print(x)
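
Calling split() without a separator is not the same as calling split(" "). Without a separator, runs of whitespace count as one separator and leading or trailing whitespace is ignored. With an explicit separator, every single occurrence splits, which can produce empty strings. A small sketch:

```python
txt = "  a  b "

default = txt.split()      # any run of whitespace is one separator
explicit = txt.split(" ")  # every single space is a separator

print(default)   # ['a', 'b']
print(explicit)  # ['', '', 'a', '', 'b', '']
```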

Python String splitlines() Method

Example

Split a string into a list where each line is a list item:

txt = "Thank you for the music\nWelcome to the jungle"

x = txt.splitlines()

print(x)

Definition and Usage

The splitlines() method splits a string into a list. The splitting is done at line breaks.

Syntax

string.splitlines(keeplinebreaks)

Parameter Values

ParameterDescription
keeplinebreaksOptional. Specifies if the line breaks should be included (True), or not (False). Default value is False

More Examples

Example

Split the string, but keep the line breaks:

txt = "Thank you for the music\nWelcome to the jungle"

x = txt.splitlines(True)

print(x)
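
splitlines() recognizes several kinds of line breaks, including "\r\n", and does not produce an empty item for a line break at the very end. The sketch below compares it with split("\n") on an illustrative string:

```python
txt = "first\r\nsecond\nthird\n"

lines = txt.splitlines()
naive = txt.split("\n")
kept = txt.splitlines(True)

print(lines)  # ['first', 'second', 'third']
print(naive)  # ['first\r', 'second', 'third', '']
print(kept)   # ['first\r\n', 'second\n', 'third\n']
```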

Python String startswith() Method

Example

Check if the string starts with "Hello":

txt = "Hello, welcome to my world."

x = txt.startswith("Hello")

print(x)

Definition and Usage

The startswith() method returns True if the string starts with the specified value, otherwise False.

Syntax

string.startswith(value, start, end)

Parameter Values

ParameterDescription
valueRequired. The value to check if the string starts with
startOptional. An Integer specifying at which position to start the search
endOptional. An Integer specifying at which position to end the search

More Examples

Example

Check if position 7 to 20 starts with the characters "wel":

txt = "Hello, welcome to my world."

x = txt.startswith("wel", 7, 20)

print(x)

Python String strip() Method

Example

Remove spaces at the beginning and at the end of the string:

txt = "     banana     "

x = txt.strip()

print("of all fruits", x, "is my favorite")

Definition and Usage

The strip() method removes leading characters (at the beginning of the string) and trailing characters (at the end of the string). By default it removes whitespace; pass a string of characters to remove those instead.

Syntax

string.strip(characters)

Parameter Values

Parameter     Description
characters    Optional. A string whose characters are treated as a set; any of them is removed from the beginning and the end of the string
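
The characters argument is a set of individual characters, not a prefix or suffix to match. The sketch below shows this with made-up strings, along with the related lstrip() and rstrip() methods, which work on one side only.

```python
# Default: all surrounding whitespace goes, including tabs and newlines.
clean = "\t banana \n".strip()

# Each character in "xy" is stripped in any order and any number of times.
txt = "xyxhello worldyx"
both = txt.strip("xy")
left = txt.lstrip("xy")    # beginning only
right = txt.rstrip("xy")   # end only

# Stripping stops at the first character that is not in the set.
domain = "www.example.com".strip("w.moc")

print(clean)   # banana
print(both)    # hello world
print(left)    # hello worldyx
print(right)   # xyxhello world
print(domain)  # example
```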

More Examples

Example

Remove the leading and trailing characters:

txt = ",,,,,rrttgg.....banana....rrr"

x = txt.strip(",.grt")

print(x)

Python String swapcase() Method

Example

Make the lower case letters upper case and the upper case letters lower case:

txt = "Hello My Name Is PETER"

x = txt.swapcase()

print(x)

Definition and Usage

The swapcase() method returns a string where all the upper case letters are lower case and vice versa.

Syntax

string.swapcase()

Parameter Values

No parameters.


Python String title() Method

Example

Make the first letter in each word upper case:

txt = "Welcome to my world"

x = txt.title()

print(x)

Definition and Usage

The title() method returns a string where the first character in every word is upper case, as in a header or a title.

If the word contains a number or a symbol, the first letter after that will be converted to upper case.
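
One consequence of this rule is that an apostrophe also counts as a symbol, so contractions come out oddly. The sketch below shows that, with string.capwords() from the standard library as one possible alternative when words are separated by whitespace. The sample phrase is arbitrary.

```python
import string

phrase = "they're bill's friends"

# title() upper-cases the first letter after ANY non-letter,
# including the apostrophe.
titled = phrase.title()

# capwords() splits on whitespace only, capitalizes each word,
# and joins the words back with single spaces.
capped = string.capwords(phrase)

print(titled)  # They'Re Bill'S Friends
print(capped)  # They're Bill's Friends
```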

Syntax

string.title()

Parameter Values

No parameters.

More Examples

Example

Make the first letter in each word upper case; note that the letter following a digit is also upper-cased:

txt = "Welcome to my 2nd world"

x = txt.title()

print(x)

Example

Note that the first letter after a non-alphabetic character is converted into an upper case letter:

txt = "hello b2b2b2 and 3g3g3g"

x = txt.title()

print(x)

Python String translate() Method

Example

Replace any "S" characters with a "P" character:

# Use a dictionary of character codes to replace 83 (S) with 80 (P):
mydict = {83:  80}
txt = "Hello Sam!"
print(txt.translate(mydict))

Definition and Usage

The translate() method returns a string where some specified characters are replaced with the character described in a dictionary, or in a mapping table.

Use the maketrans() method to create a mapping table.

If a character is not specified in the dictionary/table, the character will not be replaced.

If you use a dictionary, the keys must be character codes (Unicode code points, which match ASCII codes for ASCII characters) instead of characters.
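
Looking up character codes by hand is error-prone, so a dictionary can be built with the built-in ord() function instead. This is a minimal sketch with arbitrary sample text; it also shows that a value may be a replacement string or None (to delete the character), and that str.maketrans() accepts a dictionary with single-character keys.

```python
txt = "Hello Sam!"

# ord() returns the integer code for a character, so no lookup table is needed.
by_ord = txt.translate({ord("S"): ord("P")})

# Values may also be strings (of any length) or None to remove the character.
expanded = txt.translate({ord("S"): "Dr. S", ord("!"): None})

# str.maketrans() converts single-character keys to their codes.
table = str.maketrans({"S": "P"})
by_table = txt.translate(table)

print(by_ord)    # Hello Pam!
print(expanded)  # Hello Dr. Sam
print(by_table)  # Hello Pam!
```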

Syntax

string.translate(table)

Parameter Values

Parameter    Description
table        Required. Either a dictionary, or a mapping table describing how to perform the replace

More Examples

Example

Use a mapping table to replace "S" with "P":

txt = "Hello Sam!"
mytable = txt.maketrans("S", "P")
print(txt.translate(mytable))

Example

Use a mapping table to replace many characters:

txt = "Hi Sam!"
x = "mSa"
y = "eJo"
mytable = txt.maketrans(x, y)
print(txt.translate(mytable))

Example

The third parameter of maketrans() lists the characters that you want to remove from the string:

txt = "Good night Sam!"
x = "mSa"
y = "eJo"
z = "odnght"
mytable = txt.maketrans(x, y, z)
print(txt.translate(mytable))

Example

The same example as above, but using a dictionary instead of a mapping table:

txt = "Good night Sam!"
mydict = {109: 101, 83: 74, 97: 111, 111: None, 100: None, 110: None, 103: None, 104: None, 116: None}
print(txt.translate(mydict))

Python String upper() Method

Example

Upper case the string:

txt = "Hello my friends"

x = txt.upper()

print(x)

Definition and Usage

The upper() method returns a string where all characters are in upper case.

Symbols and numbers are left unchanged.
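
A short sketch of this behaviour, using made-up strings. It also shows one case that can be surprising: for some non-ASCII letters, such as the German "ß", the upper case form is longer than the original.

```python
mixed = "abc-123 ok!"

# Letters change; digits, punctuation and spaces pass through untouched.
shouted = mixed.upper()

# "ß" has no single-character upper case form here, so it becomes "SS".
german = "straße"
german_upper = german.upper()

print(shouted)       # ABC-123 OK!
print(german_upper)  # STRASSE
print(len(german), len(german_upper))  # 6 7
```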

Syntax

string.upper()

Parameter Values

No parameters.

Python String zfill() Method

Example

Fill the string with zeros until it is 10 characters long:

txt = "50"

x = txt.zfill(10)

print(x)

Definition and Usage

The zfill() method adds zeros (0) at the beginning of the string, until it reaches the specified length.

If the value of the len parameter is less than the length of the string, no filling is done.
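
One detail not covered above is how zfill() treats a leading sign: the zeros go after a "+" or "-" rather than in front of it. The sketch below compares this with rjust(), which pads blindly. The sample values are arbitrary.

```python
# The sign stays at the front; zeros are inserted after it.
negative = "-42".zfill(5)
positive = "+7".zfill(4)

# A length shorter than the string leaves it unchanged.
unchanged = "42".zfill(1)

# rjust() with "0" does not know about signs.
padded = "-42".rjust(5, "0")

print(negative)   # -0042
print(positive)   # +007
print(unchanged)  # 42
print(padded)     # 00-42
```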

Syntax

string.zfill(len)

Parameter Values

Parameter    Description
len          Required. A number specifying the desired length of the string

More Examples

Example

Fill the strings with zeros until they are 10 characters long:

a = "hello"
b = "welcome to the jungle"
c = "10.000"

print(a.zfill(10))
print(b.zfill(10))
print(c.zfill(10))

#python #programming #developer