1659733560
UnicodePlot lets you draw charts using Unicode characters.
$ gem install unicode_plot
require 'unicode_plot'
x = 0.step(3*Math::PI, by: 3*Math::PI / 30)
y_sin = x.map {|xi| Math.sin(xi) }
y_cos = x.map {|xi| Math.cos(xi) }
plot = UnicodePlot.lineplot(x, y_sin, name: "sin(x)", width: 40, height: 10)
UnicodePlot.lineplot!(plot, x, y_cos, name: "cos(x)")
plot.render
You can get the results below by running the above script:
UnicodePlot.barplot(data: {'foo': 20, 'bar': 50}, title: "Bar").render
UnicodePlot.boxplot(data: {foo: [1, 3, 5], bar: [3, 5, 7]}, title: "Box").render
x = Array.new(500) { 20*rand - 10 } + Array.new(500) { 6*rand - 3 }
y = Array.new(1000) { 30*rand - 10 }
UnicodePlot.densityplot(x, y, title: "Density").render
x = Array.new(100) { rand(10) } + Array.new(100) { rand(30) + 10 }
UnicodePlot.histogram(x, title: "Histogram").render
See Usage section above.
x = Array.new(50) { rand(20) - 10 }
y = x.map {|xx| xx*rand(30) - 10 }
UnicodePlot.scatterplot(x, y, title: "Scatter").render
This library is strongly inspired by UnicodePlot.jl.
https://red-data-tools.github.io/unicode_plot.rb/
MIT License
Author: red-data-tools
Source code: https://github.com/red-data-tools/unicode_plot.rb
License: MIT license
1652748716
Exploratory data analysis is used by data scientists to analyze and investigate data sets and summarize their main characteristics, often employing data visualization methods. It helps determine how best to manipulate data sources to get the answers you need, making it easier for data scientists to discover patterns, spot anomalies, test a hypothesis, or check assumptions. EDA is primarily used to see what data can reveal beyond the formal modeling or hypothesis testing task and provides a better understanding of data set variables and the relationships between them. It can also help determine if the statistical techniques you are considering for data analysis are appropriate or not.
🔹 Topics Covered:
00:00:00 Basics of EDA with Python
01:40:10 Multiple Variate Analysis
02:30:26 Outlier Detection
03:44:48 Cricket World Cup Analysis using Exploratory Data Analysis
If we want to explain EDA in simple terms, it means trying to understand the given data much better, so that we can make some sense out of it.
We can find a more formal definition in Wikipedia.
In statistics, exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.
EDA in Python uses data visualization to draw meaningful patterns and insights. It also involves the preparation of data sets for analysis by removing irregularities in the data.
Based on the results of EDA, companies also make business decisions, which can have repercussions later.
In this article, we’ll cover the following topics:
Data Sourcing is the process of finding and loading the data into our system. Broadly there are two ways in which we can find data.
Private Data
As the name suggests, private data is provided by private organizations. It comes with security and privacy concerns, and is used mainly for an organization’s internal analysis.
Public Data
This type of data is available to everyone. We can find it on government websites, through public organizations, and so on. Anyone can access this data without special permission or approval.
We can get public data on the following sites.
The very first step of EDA is Data Sourcing: we have seen how we can access data and load it into our system. The next step is to clean the data.
After completing the Data Sourcing, the next step in the process of EDA is Data Cleaning. It is very important to get rid of the irregularities and clean the data after sourcing it into our system.
Irregularities in the data come in different types.
To perform the data cleaning we are using a sample data set, which can be found here.
We are using Jupyter Notebook for analysis.
First, let’s import the necessary libraries and store the data in our system for analysis.
#import the useful libraries.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
# Read the data set of "Marketing Analysis" in data.
data= pd.read_csv("marketing_analysis.csv")
# Printing the data
data
Now, the data set looks like this,
If we observe the above dataset, there are some discrepancies in the Column header for the first 2 rows. The correct data is from the index number 1. So, we have to fix the first two rows.
This is called Fixing the Rows and Columns. Let’s ignore the first two rows and load the data again.
#import the useful libraries.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
# Read the file in data without first two rows as it is of no use.
data = pd.read_csv("marketing_analysis.csv",skiprows = 2)
#print the head of the data frame.
data.head()
Now, the dataset looks like this, and it makes more sense.
Dataset after fixing the rows and columns
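To see what skiprows does without needing the marketing file, here is a minimal sketch on a made-up in-memory CSV. The junk header lines and the column names are assumptions chosen for illustration, not the contents of the real data set.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV whose first two lines are not real data.
raw_csv = (
    "banking marketing,,\n"
    "customer details,,\n"
    "customerid,age,salary\n"
    "1,58,100000\n"
    "2,44,60000\n"
)

# Without skiprows, the first junk line is mistaken for the header.
messy = pd.read_csv(io.StringIO(raw_csv))

# With skiprows=2, the third line becomes the header row.
clean = pd.read_csv(io.StringIO(raw_csv), skiprows=2)

print(list(messy.columns))
print(list(clean.columns))
```

Comparing the two column lists makes it easy to confirm that the right line was picked up as the header.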
Following are the steps to be taken while Fixing Rows and Columns:
Now if we observe the above dataset, the customerid column is of no importance to our analysis, and the jobedu column holds both the job and education information.
So we’ll drop the customerid column and split the jobedu column into two new columns, job and education. After that, we’ll drop the jobedu column as well.
# Drop the customer id as it is of no use.
data.drop('customerid', axis = 1, inplace = True)
# Extract job and education into new columns from the "jobedu" column.
data['job']= data["jobedu"].apply(lambda x: x.split(",")[0])
data['education']= data["jobedu"].apply(lambda x: x.split(",")[1])
# Drop the "jobedu" column from the dataframe.
data.drop('jobedu', axis = 1, inplace = True)
# Printing the Dataset
data
Now, the dataset looks like this,
Dropping the customerid and jobedu columns and adding job and education columns
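As an alternative to two apply calls, pandas can split a delimited column in one step with str.split(expand=True). The sketch below uses a tiny made-up frame; the sample values are assumptions, and only the jobedu, job and education names come from the article.

```python
import pandas as pd

# Toy frame with a combined "job,education" column (values are made up).
toy = pd.DataFrame({"jobedu": ["management,tertiary", "technician,secondary"]})

# expand=True returns one column per piece; n=1 splits on the first comma only.
parts = toy["jobedu"].str.split(",", n=1, expand=True)
toy["job"] = parts[0]
toy["education"] = parts[1]

# Drop the combined column now that both pieces are stored separately.
toy = toy.drop(columns="jobedu")

print(toy)
```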
Missing Values
If the dataset contains missing values, we need to handle them before doing any statistical analysis.
There are mainly three types of missing values.
Let’s see which columns have missing values in the dataset.
# Checking the missing values
data.isnull().sum()
The output will be,
As we can see three columns contain missing values. Let’s see how to handle the missing values. We can handle missing values by dropping the missing records or by imputing the values.
Drop the missing Values
Let’s handle missing values in the age column.
# Dropping the records with age missing in data dataframe.
data = data[~data.age.isnull()].copy()
# Checking the missing values in the dataset.
data.isnull().sum()
Let’s check the missing values in the dataset now.
Let’s impute values to the missing values for the month column.
Since the month column is of an object type, let’s calculate the mode of that column and impute those values to the missing values.
# Find the mode of month in data
month_mode = data.month.mode()[0]
# Fill the missing values with mode value of month in data.
data['month'] = data['month'].fillna(month_mode)
# Let's see the null values in the month column.
data.month.isnull().sum()
Now output is,
# Mode of month is
'may, 2017'
# Null values in month column after imputing with mode
0
Next, let’s handle the missing values in the Response column. Since Response is our target column, imputing values into it would affect our analysis. So it is better to drop the records where Response is missing.
#drop the records with response missing in data.
data = data[~data.response.isnull()].copy()
# Calculate the missing values in each column of data frame
data.isnull().sum()
Let’s check whether the missing values in the dataset have been handled or not,
All the missing values have been handled
We can also mark the missing values as ‘NaN’ so that they do not affect the outcome of any statistical analysis.
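A small sketch of why NaN is a safe marker: pandas reductions such as mean() skip NaN by default, whereas a numeric placeholder like 0 is counted as a real observation. The numbers below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy column with one missing entry (values are made up).
ages = pd.Series([30.0, 40.0, np.nan, 50.0])

# NaN is ignored by default (skipna=True), so only three values are averaged.
mean_skipping_nan = ages.mean()

# Replacing the gap with 0 treats it as real data and drags the mean down.
mean_with_zero = ages.fillna(0).mean()

print(mean_skipping_nan, mean_with_zero)
```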
Handling Outliers
We have seen how to fix missing values, now let’s see how to handle outliers in the dataset.
Outliers are the values that are far beyond the next nearest data points.
There are two types of outliers:
So, after understanding the causes of these outliers, we can handle them by dropping those records or imputing with the values or leaving them as is, if it makes more sense.
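The article does not name a detection rule, so the sketch below uses the common 1.5 × IQR fences as one possible way to flag candidate outliers before deciding what to do with them. The balances are made-up values, and the 1.5 multiplier is a convention, not something taken from the text.

```python
import pandas as pd

# Toy account balances with one extreme value (values are made up).
balances = pd.Series([120, 150, 160, 170, 180, 200, 5000])

# Quartiles and interquartile range.
q1 = balances.quantile(0.25)
q3 = balances.quantile(0.75)
iqr = q3 - q1

# Conventional fences: anything outside them is flagged for a closer look.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
is_outlier = (balances < lower) | (balances > upper)
outliers = balances[is_outlier]

print(outliers)
```

Flagging is only the first step; whether to drop, impute or keep the flagged records still depends on understanding why they are extreme.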
Standardizing Values
To perform data analysis on a set of values, we have to make sure that all values in the same column are on the same scale. For example, if the data contains the top speeds of different companies’ cars, then the whole column should be either in meters/sec or in miles/sec.
Now that we are clear on how to source and clean the data, let’s see how we can analyze the data.
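One way to bring a mixed-unit column onto a single scale is to keep the unit in its own column and multiply by a conversion factor. The sketch below is an illustration with made-up speeds; the column names and the choice of meters per second as the target unit are assumptions.

```python
import pandas as pd

# Toy data: top speeds recorded in different units (values are made up).
cars = pd.DataFrame({
    "top_speed": [180.0, 120.0, 50.0],
    "unit": ["km/h", "mph", "m/s"],
})

# Conversion factors from each unit to meters per second.
TO_METERS_PER_SECOND = {
    "km/h": 1000 / 3600,
    "mph": 1609.344 / 3600,
    "m/s": 1.0,
}

# Map each row's unit to its factor and rescale the whole column at once.
cars["top_speed_mps"] = cars["top_speed"] * cars["unit"].map(TO_METERS_PER_SECOND)

print(cars)
```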
If we analyze data over a single variable/column from a dataset, it is known as Univariate Analysis.
Categorical Unordered Univariate Analysis:
An unordered variable is a categorical variable that has no defined order. If we take our data as an example, the job column in the dataset is divided into many sub-categories like technician, blue-collar, services, management, etc. There is no weight or measure given to any value in the ‘job’ column.
Now, let’s analyze the job category by using plots. Since Job is a category, we will plot the bar plot.
# Let's calculate the percentage of each job status category.
data.job.value_counts(normalize=True)
#plot the bar graph of percentage job categories
data.job.value_counts(normalize=True).plot.barh()
plt.show()
The output looks like this,
From the above bar plot, we can infer that the data set contains more blue-collar workers than any other job category.
Categorical Ordered Univariate Analysis:
Ordered variables are those variables that have a natural rank of order. Some examples of categorical ordered variables from our dataset are:
Now, let’s analyze the Education variable from the dataset. Since we’ve already seen a bar plot, let’s see what a pie chart looks like.
#calculate the percentage of each education category.
data.education.value_counts(normalize=True)
#plot the pie chart of education categories
data.education.value_counts(normalize=True).plot.pie()
plt.show()
The output will be,
From the above analysis, we can infer that most people in the data set have secondary education, followed by tertiary and then primary. A very small percentage have an unknown education level.
This is how we perform univariate analysis on categorical variables. If the column or variable is numerical, we analyze it by calculating its mean, median, standard deviation, etc. We can get those values by using the describe function.
data.salary.describe()
The output will be,
If we analyze data by taking two variables/columns into consideration from a dataset, it is known as Bivariate Analysis.
a) Numeric-Numeric Analysis:
Analyzing two numeric variables from a dataset is known as numeric-numeric analysis. We can analyze them in three different ways.
Scatter Plot
Let’s take three columns, ‘Balance’, ‘Age’ and ‘Salary’, from our dataset and see what we can infer by plotting scatter plots of salary vs. balance and age vs. balance.
#plot the scatter plot of balance and salary variable in data
plt.scatter(data.salary,data.balance)
plt.show()
#plot the scatter plot of balance and age variable in data
data.plot.scatter(x="age",y="balance")
plt.show()
Now, the scatter plots look like this,
Pair Plot
Now, let’s plot Pair Plots for the three columns we used in plotting Scatter plots. We’ll use the seaborn library for plotting Pair Plots.
#plot the pair plot of salary, balance and age in data dataframe.
sns.pairplot(data = data, vars=['salary','balance','age'])
plt.show()
The Pair Plot looks like this,
Correlation Matrix
Since we cannot use more than two variables as x-axis and y-axis in Scatter and Pair Plots, it is difficult to see the relation between three numerical variables in a single graph. In those cases, we’ll use the correlation matrix.
# Creating a matrix using age, salary, balance as rows and columns
data[['age','salary','balance']].corr()
#plot the correlation matrix of salary, balance and age in data dataframe.
sns.heatmap(data[['age','salary','balance']].corr(), annot=True, cmap = 'Reds')
plt.show()
First, we created a correlation matrix using age, salary, and balance. After that, we plotted a heatmap of that matrix using the seaborn library.
b) Numeric - Categorical Analysis
Analyzing one numeric variable and one categorical variable from a dataset is known as numeric-categorical analysis. We analyze them mainly using mean, median, and box plots.
Let’s take the salary and response columns from our dataset.
First check for mean value using groupby
#groupby the response to find the mean of the salary with response no & yes separately.
data.groupby('response')['salary'].mean()
The output will be,
There is not much of a difference between the yes and no response based on the salary.
Let’s calculate the median,
#groupby the response to find the median of the salary with response no & yes separately.
data.groupby('response')['salary'].median()
The output will be,
Going by both mean and median, the yes and no responses look the same irrespective of the person’s salary. But is that really the case? Let’s plot the box plot for them and check.
#plot the box plot of salary for yes & no responses.
sns.boxplot(x=data.response, y=data.salary)
plt.show()
The box plot looks like this,
As we can see, when we plot the Box Plot, it paints a very different picture compared to mean and median. The IQR for customers who gave a positive response is on the higher salary side.
This is how we analyze Numeric-Categorical variables: we use mean, median, and Box Plots to draw conclusions.
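The numbers behind a box plot can also be read directly with a grouped quantile. The toy data below is made up so that both groups share the same mean and median while their quartiles differ; it is a sketch of the idea, not a reproduction of the marketing data set.

```python
import pandas as pd

# Toy salaries (in thousands) with identical mean and median per group.
toy = pd.DataFrame({
    "response": ["no", "no", "no", "no", "yes", "yes", "yes", "yes"],
    "salary": [20, 50, 70, 100, 50, 60, 60, 70],
})

grouped = toy.groupby("response")["salary"]
means = grouped.mean()
medians = grouped.median()

# Quartiles per group; unstack turns the quantile level into columns.
quartiles = grouped.quantile([0.25, 0.5, 0.75]).unstack()

print(means)
print(medians)
print(quartiles)
```

Here the mean and median agree across groups, yet the 25% and 75% columns show that the middle half of each group sits in a different range, which is exactly what the box of a box plot draws.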
c) Categorical — Categorical Analysis
Since our target variable/column is the Response, we’ll see how the different categories like Education, Marital Status, etc., are associated with the Response column. To do that, we’ll convert ‘Yes’ and ‘No’ into ‘1’ and ‘0’; the mean of the resulting column gives us the “Response Rate”.
#create response_rate of numerical data type where response "yes"= 1, "no"= 0
data['response_rate'] = np.where(data.response=='yes',1,0)
data.response_rate.value_counts()
The output looks like this,
Let’s see how the response rate varies for different categories in marital status.
#plot the bar graph of marital status with average value of response_rate
data.groupby('marital')['response_rate'].mean().plot.bar()
plt.show()
The graph looks like this,
By the above graph, we can infer that the positive response is more for Single status members in the data set. Similarly, we can plot the graphs for Loan vs Response rate, Housing Loans vs Response rate, etc.
If we analyze data by taking more than two variables/columns into consideration from a dataset, it is known as Multivariate Analysis.
Let’s see how ‘Education’, ‘Marital’, and ‘Response_rate’ vary with each other.
First, we’ll create a pivot table with the three columns and after that, we’ll create a heatmap.
result = pd.pivot_table(data=data, index='education', columns='marital',values='response_rate')
print(result)
#create heat map of education vs marital vs response_rate
sns.heatmap(result, annot=True, cmap = 'RdYlGn', center=0.117)
plt.show()
The pivot table and heatmap look like this,
Based on the heatmap, we can infer that married people with primary education are less likely to respond positively to the survey, and single people with tertiary education are most likely to respond positively.
Similarly, we can plot the graphs for Job vs marital vs response, Education vs poutcome vs response, etc.
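To make the pivot step concrete, here is a minimal sketch on six made-up rows. pd.pivot_table averages the values column for every index/column pair (mean is its default aggregation, spelled out here for clarity), so each cell is the response rate of that education and marital combination.

```python
import pandas as pd

# Toy records (values are made up); response_rate is 1 for "yes", 0 for "no".
toy = pd.DataFrame({
    "education": ["primary", "primary", "tertiary", "tertiary", "tertiary", "primary"],
    "marital": ["married", "married", "single", "single", "married", "single"],
    "response_rate": [0, 1, 1, 1, 0, 0],
})

# Each cell holds the mean response_rate for one education/marital pair.
rates = pd.pivot_table(
    toy,
    index="education",
    columns="marital",
    values="response_rate",
    aggfunc="mean",
)

print(rates)
```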
Conclusion
This is how we’ll do Exploratory Data Analysis. Exploratory Data Analysis (EDA) helps us to look beyond the data. The more we explore the data, the more the insights we draw from it. As a data analyst, almost 80% of our time will be spent understanding data and solving various business problems through EDA.
Thank you for reading and Happy Coding!!!
#dataanalysis #python
1561523460
This Matplotlib cheat sheet introduces you to the basics that you need to plot your data with Python and includes code samples.
Data visualization and storytelling with your data are essential skills that every data scientist needs to communicate insights gained from analyses effectively to any audience out there.
For most beginners, the first package that they use to get in touch with data visualization and storytelling is, naturally, Matplotlib: it is a Python 2D plotting library that enables users to make publication-quality figures. But, what might be even more convincing is the fact that other packages, such as Pandas, intend to build more plotting integration with Matplotlib as time goes on.
However, what might slow down beginners is the fact that this package is pretty extensive. There is so much that you can do with it and it might be hard to still keep a structure when you're learning how to work with Matplotlib.
DataCamp has created a Matplotlib cheat sheet for those who might already know how to use the package to their advantage to make beautiful plots in Python, but who still want to keep a one-page reference handy. Of course, for those who don't know how to work with Matplotlib, this might be the extra push to be convinced and finally get started with data visualization in Python.
You'll see that this cheat sheet presents you with the six basic steps that you can go through to make beautiful plots.
With this handy reference, you'll familiarize yourself in no time with the basics of Matplotlib: you'll learn how you can prepare your data, create a new plot, use some basic plotting routines to your advantage, add customizations to your plots, and save, show and close the plots that you make.
What might have looked difficult before will definitely be clearer once you start using this cheat sheet! Use it in combination with the Matplotlib Gallery and the documentation.
Matplotlib
Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.
>>> import numpy as np
>>> x = np.linspace(0, 10, 100)
>>> y = np.cos(x)
>>> z = np.sin(x)
>>> data = 2 * np.random.random((10, 10))
>>> data2 = 3 * np.random.random((10, 10))
>>> Y, X = np.mgrid[-3:3:100j, -3:3:100j]
>>> U = -1 - X**2 + Y
>>> V = 1 + X - Y**2
>>> from matplotlib.cbook import get_sample_data
>>> img = np.load(get_sample_data('axes_grid/bivariate_normal.npy'))
>>> import matplotlib.pyplot as plt
>>> fig = plt.figure()
>>> fig2 = plt.figure(figsize=plt.figaspect(2.0))
>>> fig.add_axes()
>>> ax1 = fig.add_subplot(221) #row-col-num
>>> ax3 = fig.add_subplot(212)
>>> fig3, axes = plt.subplots(nrows=2,ncols=2)
>>> fig4, axes2 = plt.subplots(ncols=3)
>>> plt.savefig('foo.png') #Save figures
>>> plt.savefig('foo.png', transparent=True) #Save transparent figures
>>> plt.show()
>>> fig, ax = plt.subplots()
>>> lines = ax.plot(x,y) #Draw points with lines or markers connecting them
>>> ax.scatter(x,y) #Draw unconnected points, scaled or colored
>>> axes[0,0].bar([1,2,3],[3,4,5]) #Plot vertical rectangles (constant width)
>>> axes[1,0].barh([0.5,1,2.5],[0,1,2]) #Plot horizontal rectangles (constant height)
>>> axes[1,1].axhline(0.45) #Draw a horizontal line across axes
>>> axes[0,1].axvline(0.65) #Draw a vertical line across axes
>>> ax.fill(x,y,color='blue') #Draw filled polygons
>>> ax.fill_between(x,y,color='yellow') #Fill between y values and 0
>>> fig, ax = plt.subplots()
>>> im = ax.imshow(img, #Colormapped or RGB arrays
cmap= 'gist_earth',
interpolation= 'nearest',
vmin=-2,
vmax=2)
>>> axes2[0].pcolor(data2) #Pseudocolor plot of 2D array
>>> axes2[0].pcolormesh(data) #Pseudocolor plot of 2D array
>>> CS = plt.contour(Y,X,U) #Plot contours
>>> axes2[2].contourf(data) #Plot filled contours
>>> axes2[2]= ax.clabel(CS) #Label a contour plot
>>> axes[0,1].arrow(0,0,0.5,0.5) #Add an arrow to the axes
>>> axes[1,1].quiver(y,z) #Plot a 2D field of arrows
>>> axes[0,1].streamplot(X,Y,U,V) #Plot streamlines of a 2D vector field
>>> ax1.hist(y) #Plot a histogram
>>> ax3.boxplot(y) #Make a box and whisker plot
>>> ax3.violinplot(z) #Make a violin plot
The basic steps to creating plots with matplotlib are:
1 Prepare Data
2 Create Plot
3 Plot
4 Customize Plot
5 Save Plot
6 Show Plot
>>> import matplotlib.pyplot as plt
>>> x = [1,2,3,4] #Step 1
>>> y = [10,20,25,30]
>>> fig = plt.figure() #Step 2
>>> ax = fig.add_subplot(111) #Step 3
>>> ax.plot(x, y, color= 'lightblue', linewidth=3) #Step 3, 4
>>> ax.scatter([2,4,6],
[5,15,25],
color= 'darkgreen',
marker= '^' )
>>> ax.set_xlim(1, 6.5)
>>> plt.savefig('foo.png' ) #Step 5
>>> plt.show() #Step 6
>>> plt.cla() #Clear an axis
>>> plt.clf() #Clear the entire figure
>>> plt.close() #Close a window
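The six steps can also be written as a standalone script using the object-oriented interface. The sketch below is one possible arrangement, not the cheat sheet's own code: it selects the non-interactive Agg backend so it runs without a display, and it saves to a temporary directory, both of which are assumptions made for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Step 1: prepare data
x = np.linspace(0, 10, 100)
y = np.cos(x)

# Step 2: create the figure and a single axes
fig, ax = plt.subplots()

# Step 3: plot
ax.plot(x, y, color="lightblue", linewidth=3, label="cos(x)")

# Step 4: customize
ax.set(title="An Example Axes", xlabel="X-Axis", ylabel="Y-Axis")
ax.legend(loc="best")

# Step 5: save (the temporary output path is illustrative)
out_path = os.path.join(tempfile.mkdtemp(), "foo.png")
fig.savefig(out_path)

# Step 6: in an interactive session plt.show() would open the window here;
# with the Agg backend we simply close the figure to free its memory.
plt.close(fig)
```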
>>> plt.plot(x, x, x, x**2, x, x** 3)
>>> ax.plot(x, y, alpha = 0.4)
>>> ax.plot(x, y, c= 'k')
>>> fig.colorbar(im, orientation= 'horizontal')
>>> im = ax.imshow(img,
cmap= 'seismic' )
>>> fig, ax = plt.subplots()
>>> ax.scatter(x,y,marker= ".")
>>> ax.plot(x,y,marker= "o")
>>> plt.plot(x,y,linewidth=4.0)
>>> plt.plot(x,y,ls= 'solid')
>>> plt.plot(x,y,ls= '--')
>>> plt.plot(x,y,'--' ,x**2,y**2,'-.' )
>>> plt.setp(lines,color= 'r',linewidth=4.0)
>>> ax.text(1,
-2.1,
'Example Graph',
style= 'italic' )
>>> ax.annotate("Sine",
xy=(8, 0),
xycoords= 'data',
xytext=(10.5, 0),
textcoords= 'data',
arrowprops=dict(arrowstyle= "->",
connectionstyle="arc3"),)
>>> plt.title(r'$\sigma_i=15$', fontsize=20)
Limits & Autoscaling
>>> ax.margins(x=0.0,y=0.1) #Add padding to a plot
>>> ax.axis('equal') #Set the aspect ratio of the plot to 1
>>> ax.set(xlim=[0,10.5],ylim=[-1.5,1.5]) #Set limits for x-and y-axis
>>> ax.set_xlim(0,10.5) #Set limits for x-axis
Legends
>>> ax.set(title= 'An Example Axes', #Set a title and x-and y-axis labels
ylabel= 'Y-Axis',
xlabel= 'X-Axis')
>>> ax.legend(loc= 'best') #No overlapping plot elements
Ticks
>>> ax.xaxis.set(ticks=range(1,5), #Manually set x-ticks
ticklabels=[3,100, 12,"foo" ])
>>> ax.tick_params(axis= 'y', #Make y-ticks longer and go in and out
direction= 'inout',
length=10)
Subplot Spacing
>>> fig3.subplots_adjust(wspace=0.5, #Adjust the spacing between subplots
hspace=0.3,
left=0.125,
right=0.9,
top=0.9,
bottom=0.1)
>>> fig.tight_layout() #Fit subplot(s) in to the figure area
Axis Spines
>>> ax1.spines[ 'top'].set_visible(False) #Make the top axis line for a plot invisible
>>> ax1.spines['bottom' ].set_position(( 'outward',10)) #Move the bottom axis line outward
Have this Cheat Sheet at your fingertips
Original article source at https://www.datacamp.com
#matplotlib #cheatsheet #python
1620466520
If you accumulate data on which you base your decision-making as an organization, you most probably need to think about your data architecture and consider possible best practices. Gaining a competitive edge, remaining customer-centric to the greatest extent possible, and streamlining processes to get on-the-button outcomes can all be traced back to an organization’s capacity to build a future-ready data architecture.
In what follows, we offer a short overview of the overarching capabilities of data architecture. These include user-centricity, elasticity, robustness, and the capacity to ensure the seamless flow of data at all times. Added to these are automation enablement, plus security and data governance considerations. These points form our checklist for what we perceive to be an anticipatory analytics ecosystem.
#big data #data science #big data analytics #data analysis #data architecture #data transformation #data platform #data strategy #cloud data platform #data acquisition
1620629020
The opportunities big data offers also come with very real challenges that many organizations are facing today. Often, it’s finding the most cost-effective, scalable way to store and process boundless volumes of data in multiple formats that come from a growing number of sources. Then organizations need the analytical capabilities and flexibility to turn this data into insights that can meet their specific business objectives.
This Refcard dives into how a data lake helps tackle these challenges at both ends — from its enhanced architecture that’s designed for efficient data ingestion, storage, and management to its advanced analytics functionality and performance flexibility. You’ll also explore key benefits and common use cases.
As technology continues to evolve with new data sources, such as IoT sensors and social media churning out large volumes of data, there has never been a better time to discuss the possibilities and challenges of managing such data for varying analytical insights. In this Refcard, we dig deep into how data lakes solve the problem of storing and processing enormous amounts of data. While doing so, we also explore the benefits of data lakes, their use cases, and how they differ from data warehouses (DWHs).
This is a preview of the Getting Started With Data Lakes Refcard. To read the entire Refcard, please download the PDF from the link above.
#big data #data analytics #data analysis #business analytics #data warehouse #data storage #data lake #data lake architecture #data lake governance #data lake management
1650960540
Python has a set of built-in methods that you can use on strings.
Note: All string methods return new values. They do not change the original string.
Method | Description |
---|---|
capitalize() | Converts the first character to upper case |
casefold() | Converts string into lower case |
center() | Returns a centered string |
count() | Returns the number of times a specified value occurs in a string |
encode() | Returns an encoded version of the string |
endswith() | Returns true if the string ends with the specified value |
expandtabs() | Sets the tab size of the string |
find() | Searches the string for a specified value and returns the position of where it was found |
format() | Formats specified values in a string |
format_map() | Formats specified values in a string |
index() | Searches the string for a specified value and returns the position of where it was found |
isalnum() | Returns True if all characters in the string are alphanumeric |
isalpha() | Returns True if all characters in the string are in the alphabet |
isascii() | Returns True if all characters in the string are ascii characters |
isdecimal() | Returns True if all characters in the string are decimals |
isdigit() | Returns True if all characters in the string are digits |
isidentifier() | Returns True if the string is an identifier |
islower() | Returns True if all characters in the string are lower case |
isnumeric() | Returns True if all characters in the string are numeric |
isprintable() | Returns True if all characters in the string are printable |
isspace() | Returns True if all characters in the string are whitespaces |
istitle() | Returns True if the string follows the rules of a title |
isupper() | Returns True if all characters in the string are upper case |
join() | Converts the elements of an iterable into a string |
ljust() | Returns a left justified version of the string |
lower() | Converts a string into lower case |
lstrip() | Returns a left trim version of the string |
maketrans() | Returns a translation table to be used in translations |
partition() | Returns a tuple where the string is parted into three parts |
replace() | Returns a string where a specified value is replaced with a specified value |
rfind() | Searches the string for a specified value and returns the last position of where it was found |
rindex() | Searches the string for a specified value and returns the last position of where it was found |
rjust() | Returns a right justified version of the string |
rpartition() | Returns a tuple where the string is parted into three parts |
rsplit() | Splits the string at the specified separator, and returns a list |
rstrip() | Returns a right trim version of the string |
split() | Splits the string at the specified separator, and returns a list |
splitlines() | Splits the string at line breaks and returns a list |
startswith() | Returns true if the string starts with the specified value |
strip() | Returns a trimmed version of the string |
swapcase() | Swaps cases, lower case becomes upper case and vice versa |
title() | Converts the first character of each word to upper case |
translate() | Returns a translated string |
upper() | Converts a string into upper case |
zfill() | Fills the string with a specified number of 0 values at the beginning |
Upper case the first letter in this sentence:
txt = "hello, and welcome to my world."
x = txt.capitalize()
print (x)
The capitalize() method returns a string where the first character is upper case, and the rest is lower case.
string.capitalize()
No parameters
The first character is converted to upper case, and the rest are converted to lower case:
txt = "python is FUN!"
x = txt.capitalize()
print (x)
See what happens if the first character is a number:
txt = "36 is my age."
x = txt.capitalize()
print (x)
Make the string lower case:
txt = "Hello, And Welcome To My World!"
x = txt.casefold()
print(x)
The casefold() method returns a string where all the characters are lower case.
This method is similar to the lower() method, but the casefold() method is stronger and more aggressive, meaning that it will convert more characters into lower case, and will find more matches when comparing two strings that have both been converted using the casefold() method.
string.casefold()
No parameters
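A short sketch of the difference between lower() and casefold(), using the German letter "ß" as the test case. The word chosen is just an illustration.

```python
german = "Straße"

# lower() leaves "ß" untouched; casefold() expands it to "ss".
lowered = german.lower()
folded = german.casefold()

# Comparing against an upper-case spelling only matches after casefold().
matches_with_lower = lowered == "STRASSE".lower()
matches_with_casefold = folded == "STRASSE".casefold()

print(lowered, folded)
print(matches_with_lower, matches_with_casefold)
```

For caseless comparison of user input, folding both sides with casefold() is therefore the more robust choice.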
Print the word "banana", taking up the space of 20 characters, with "banana" in the middle:
txt = "banana"
x = txt.center(20)
print(x)
The center() method will center align the string, using a specified character (space is default) as the fill character.
string.center(length, character)
Parameter | Description |
---|---|
length | Required. The length of the returned string |
character | Optional. The character to fill the missing space on each side. Default is " " (space) |
Using the letter "O" as the padding character:
txt = "banana"
x = txt.center(20, "O")
print(x)
Return the number of times the value "apple" appears in the string:
txt = "I love apples, apple are my favorite fruit"
x = txt.count("apple")
print(x)
The count() method returns the number of times a specified value appears in the string.
string.count(value, start, end)
Parameter | Description |
---|---|
value | Required. A String. The string value to search for |
start | Optional. An Integer. The position to start the search. Default is 0 |
end | Optional. An Integer. The position to end the search. Default is the end of the string |
Search from position 10 to 24:
txt = "I love apples, apple are my favorite fruit"
x = txt.count("apple", 10, 24)
print(x)
UTF-8 encode the string:
txt = "My name is Ståle"
x = txt.encode()
print(x)
The encode() method encodes the string, using the specified encoding. If no encoding is specified, UTF-8 will be used.
string.encode(encoding=encoding, errors=errors)
Parameter | Description |
---|---|
encoding | Optional. A String specifying the encoding to use. Default is UTF-8 |
errors | Optional. A String specifying the error method. Legal values are: "backslashreplace", "ignore", "namereplace", "strict" (default; raises an error on failure), "replace", and "xmlcharrefreplace" |
These examples use ascii encoding, and a character that cannot be encoded, showing the result with different errors:
txt = "My name is Ståle"
print(txt.encode(encoding="ascii",errors="backslashreplace"))
print(txt.encode(encoding="ascii",errors="ignore"))
print(txt.encode(encoding="ascii",errors="namereplace"))
print(txt.encode(encoding="ascii",errors="replace"))
print(txt.encode(encoding="ascii",errors="xmlcharrefreplace"))
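For comparison, here is a small sketch of what happens when no error method is given, so the default 'strict' handling applies, and how encoded bytes can be turned back into a string with decode(). The variable names are only illustrative.

```python
txt = "My name is Ståle"

# With the default error handling, a character that cannot be encoded raises an error
try:
    txt.encode(encoding="ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc
print(type(error).__name__)

# 'replace' substitutes a question mark for the character that cannot be encoded
replaced = txt.encode(encoding="ascii", errors="replace")
print(replaced)

# UTF-8 can represent every character, so encoding and decoding returns the original text
round_trip = txt.encode().decode()
print(round_trip == txt)
```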
Check if the string ends with a punctuation sign (.):
txt = "Hello, welcome to my world."
x = txt.endswith(".")
print(x)
The endswith() method returns True if the string ends with the specified value, otherwise False.
string.endswith(value, start, end)
Parameter | Description |
---|---|
value | Required. The value to check if the string ends with |
start | Optional. An Integer specifying at which position to start the search |
end | Optional. An Integer specifying at which position to end the search |
Check if the string ends with the phrase "my world.":
txt = "Hello, welcome to my world."
x = txt.endswith("my world.")
print(x)
Check if position 5 to 11 ends with the phrase "my world.":
txt = "Hello, welcome to my world."
x = txt.endswith("my world.", 5, 11)
print(x)
Set the tab size to 2 whitespaces:
txt = "H\te\tl\tl\to"
x = txt.expandtabs(2)
print(x)
The expandtabs() method replaces each tab character with whitespaces, using the specified tab size as the distance between tab stops.
string.expandtabs(tabsize)
Parameter | Description |
---|---|
tabsize | Optional. A number specifying the tabsize. Default tabsize is 8 |
See the result using different tab sizes:
txt = "H\te\tl\tl\to"
print(txt)
print(txt.expandtabs())
print(txt.expandtabs(2))
print(txt.expandtabs(4))
print(txt.expandtabs(10))
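Note that a tab is not always replaced by the same number of spaces: each tab is padded up to the next tab stop. A minimal sketch with an illustrative sample string:

```python
txt = "a\tbc\td"

# With a tab size of 4, tab stops are at columns 4, 8, 12, ...
# "a" needs 3 spaces to reach column 4, "bc" needs 2 spaces to reach column 8
expanded = txt.expandtabs(4)
print(repr(expanded))
```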
Where in the text is the word "welcome"?:
txt = "Hello, welcome to my world."
x = txt.find("welcome")
print(x)
The find() method finds the first occurrence of the specified value.
The find() method returns -1 if the value is not found.
The find() method is almost the same as the index() method; the only difference is that the index() method raises an exception if the value is not found. (See example below)
string.find(value, start, end)
Parameter | Description |
---|---|
value | Required. The value to search for |
start | Optional. Where to start the search. Default is 0 |
end | Optional. Where to end the search. Default is to the end of the string |
Where in the text is the first occurrence of the letter "e"?:
txt = "Hello, welcome to my world."
x = txt.find("e")
print(x)
Where in the text is the first occurrence of the letter "e" when you only search between position 5 and 10?:
txt = "Hello, welcome to my world."
x = txt.find("e", 5, 10)
print(x)
If the value is not found, the find() method returns -1, but the index() method will raise an exception:
txt = "Hello, welcome to my world."
print(txt.find("q"))
print(txt.index("q"))
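The last line of the example above stops the script with a ValueError. A sketch of how the two behaviours might be handled in practice, with variable names chosen only for illustration:

```python
txt = "Hello, welcome to my world."

# find() signals "not found" with -1, so the result must be checked before use
position = txt.find("q")
print(position)

# index() signals "not found" with a ValueError, which can be caught
try:
    txt.index("q")
    raised = False
except ValueError:
    raised = True
print(raised)

# If only a yes/no answer is needed, the in operator is enough
contains = "welcome" in txt
print(contains)
```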
Insert the price inside the placeholder, the price should be in fixed point, two-decimal format:
txt = "For only {price:.2f} dollars!"
print(txt.format(price = 49))
The format() method formats the specified value(s) and inserts them inside the string's placeholders.
The placeholder is defined using curly brackets: {}. Read more about the placeholders in the Placeholder section below.
The format() method returns the formatted string.
string.format(value1, value2...)
Parameter | Description |
---|---|
value1, value2... | Required. One or more values that should be formatted and inserted in the string. The values are either a list of values separated by commas, a key=value list, or a combination of both. The values can be of any data type. |
The placeholders can be identified using named indexes {price}, numbered indexes {0}, or even empty placeholders {}.
Using different placeholder values:
txt1 = "My name is {fname}, I'm {age}".format(fname = "John", age = 36)
txt2 = "My name is {0}, I'm {1}".format("John",36)
txt3 = "My name is {}, I'm {}".format("John",36)
Inside the placeholders you can add a formatting type to format the result:
Type | Description |
---|---|
:< | Left aligns the result (within the available space) |
:> | Right aligns the result (within the available space) |
:^ | Center aligns the result (within the available space) |
:= | Places the sign at the leftmost position |
:+ | Use a plus sign to indicate if the result is positive or negative |
:- | Use a minus sign for negative values only |
: (space) | Use a space to insert an extra space before positive numbers (and a minus sign before negative numbers) |
:, | Use a comma as a thousand separator |
:_ | Use an underscore as a thousand separator |
:b | Binary format |
:c | Converts the value into the corresponding Unicode character |
:d | Decimal format |
:e | Scientific format, with a lower case e |
:E | Scientific format, with an upper case E |
:f | Fixed point number format |
:F | Fixed point number format, in uppercase format (show inf and nan as INF and NAN) |
:g | General format |
:G | General format (using an upper case E for scientific notation) |
:o | Octal format |
:x | Hex format, lower case |
:X | Hex format, upper case |
:n | Number format |
:% | Percentage format |
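A few of the formatting types in use. This is only a sketch; the values and variable names are chosen for illustration:

```python
# Alignment: the number after the symbol is the total width of the result
right_aligned = "{:>8}".format("abc")
centered = "{:^7}".format("abc")

# Sign and thousand separator
signed = "{:+d}".format(7)
thousands = "{:,}".format(1234567)

# Number bases
binary = "{:b}".format(5)
hexadecimal = "{:x}".format(255)

# Percentage with one decimal
percent = "{:.1%}".format(0.256)

for value in (right_aligned, centered, signed, thousands, binary, hexadecimal, percent):
    print(repr(value))
```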
Where in the text is the word "welcome"?:
txt = "Hello, welcome to my world."
x = txt.index("welcome")
print(x)
The index() method finds the first occurrence of the specified value.
The index() method raises an exception if the value is not found.
The index() method is almost the same as the find() method; the only difference is that the find() method returns -1 if the value is not found. (See example below)
string.index(value, start, end)
Parameter | Description |
---|---|
value | Required. The value to search for |
start | Optional. Where to start the search. Default is 0 |
end | Optional. Where to end the search. Default is to the end of the string |
Where in the text is the first occurrence of the letter "e"?:
txt = "Hello, welcome to my world."
x = txt.index("e")
print(x)
Where in the text is the first occurrence of the letter "e" when you only search between position 5 and 10?:
txt = "Hello, welcome to my world."
x = txt.index("e", 5, 10)
print(x)
If the value is not found, the find() method returns -1, but the index() method will raise an exception:
txt = "Hello, welcome to my world."
print(txt.find("q"))
print(txt.index("q"))
Check if all the characters in the text are alphanumeric:
txt = "Company12"
x = txt.isalnum()
print(x)
The isalnum() method returns True if all the characters are alphanumeric, meaning alphabet letters (a-z) and numbers (0-9).
Examples of characters that are not alphanumeric: (space)!#%&? etc.
string.isalnum()
No parameters.
Check if all the characters in the text are alphanumeric:
txt = "Company 12"
x = txt.isalnum()
print(x)
Check if all the characters in the text are letters:
txt = "CompanyX"
x = txt.isalpha()
print(x)
The isalpha() method returns True if all the characters are alphabet letters (a-z).
Examples of characters that are not alphabet letters: (space)!#%&? etc.
string.isalpha()
No parameters.
Check if all the characters in the text are alphabetic:
txt = "Company10"
x = txt.isalpha()
print(x)
Check if all the characters in the text are ascii characters:
txt = "Company123"
x = txt.isascii()
print(x)
The isascii() method returns True if all the characters are ASCII characters (character codes 0-127), otherwise False. An empty string also returns True.
string.isascii()
No parameters.
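ASCII covers more than letters: digits, spaces, and common symbols are ASCII too, while accented letters are not. A short sketch with illustrative strings (isascii() requires Python 3.7 or later):

```python
plain = "Company123 !?"
accented = "Ståle"
empty = ""

plain_result = plain.isascii()        # letters, digits, space and symbols are all ASCII
accented_result = accented.isascii()  # "å" is outside the ASCII range
empty_result = empty.isascii()        # an empty string counts as ASCII

print(plain_result, accented_result, empty_result)
```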
Check if all the characters in the unicode object are decimals:
txt = "\u0033" #unicode for 3
x = txt.isdecimal()
print(x)
The isdecimal() method returns True if all the characters are decimals (0-9).
This method is used on unicode objects.
string.isdecimal()
No parameters.
Check if all the characters in the unicode strings are decimals:
a = "\u0030" #unicode for 0
b = "\u0047" #unicode for G
print(a.isdecimal())
print(b.isdecimal())
Check if all the characters in the text are digits:
txt = "50800"
x = txt.isdigit()
print(x)
The isdigit() method returns True if all the characters are digits, otherwise False.
Superscript digits, like ², are also considered to be digits.
string.isdigit()
No parameters.
Check if all the characters in the text are digits:
a = "\u0030" #unicode for 0
b = "\u00B2" #unicode for ²
print(a.isdigit())
print(b.isdigit())
Check if the string is a valid identifier:
txt = "Demo"
x = txt.isidentifier()
print(x)
The isidentifier() method returns True if the string is a valid identifier, otherwise False.
A string is considered a valid identifier if it only contains alphanumeric letters (a-z) and (0-9), or underscores (_). A valid identifier cannot start with a number, or contain any spaces.
string.isidentifier()
No parameters.
Check if the strings are valid identifiers:
a = "MyFolder"
b = "Demo002"
c = "2bring"
d = "my demo"
print(a.isidentifier())
print(b.isidentifier())
print(c.isidentifier())
print(d.isidentifier())
Check if all the characters in the text are in lower case:
txt = "hello world!"
x = txt.islower()
print(x)
The islower() method returns True if all the characters are in lower case, otherwise False.
Numbers, symbols and spaces are not checked, only alphabet characters.
string.islower()
No parameters.
Check if all the characters in the texts are in lower case:
a = "Hello world!"
b = "hello 123"
c = "mynameisPeter"
print(a.islower())
print(b.islower())
print(c.islower())
Check if all the characters in the text are numeric:
txt = "565543"
x = txt.isnumeric()
print(x)
The isnumeric() method returns True if all the characters are numeric (0-9), otherwise False.
Superscripts like ² and fractions like ¾ are also considered to be numeric values.
"-1" and "1.5" are NOT considered numeric values, because all the characters in the string must be numeric, and the - and the . are not.
string.isnumeric()
No parameters.
Check if the characters are numeric:
a = "\u0030" #unicode for 0
b = "\u00B2" #unicode for ²
c = "10km2"
d = "-1"
e = "1.5"
print(a.isnumeric())
print(b.isnumeric())
print(c.isnumeric())
print(d.isnumeric())
print(e.isnumeric())
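The isdecimal(), isdigit() and isnumeric() methods are easy to mix up. The sketch below runs all three on the same characters to show how they differ; the dictionary names are only illustrative.

```python
samples = {
    "five": "5",
    "superscript two": "\u00B2",   # ²
    "three quarters": "\u00BE",    # ¾
}

# For each sample, store the results as (isdecimal, isdigit, isnumeric)
results = {
    name: (s.isdecimal(), s.isdigit(), s.isnumeric())
    for name, s in samples.items()
}

for name, flags in results.items():
    print(name, flags)
```

Each method accepts everything the previous one accepts, plus a little more: a plain digit passes all three, a superscript passes isdigit() and isnumeric(), and a fraction passes only isnumeric().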
Check if all the characters in the text are printable:
txt = "Hello! Are you #1?"
x = txt.isprintable()
print(x)
The isprintable() method returns True if all the characters are printable, otherwise False.
Examples of non-printable characters are carriage return and line feed.
string.isprintable()
No parameters.
Check if all the characters in the text are printable:
txt = "Hello!\nAre you #1?"
x = txt.isprintable()
print(x)
Check if all the characters in the text are whitespaces:
txt = " "
x = txt.isspace()
print(x)
The isspace() method returns True if all the characters in a string are whitespaces, otherwise False.
string.isspace()
No parameters.
Check if all the characters in the text are whitespaces:
txt = " s "
x = txt.isspace()
print(x)
Check if each word starts with an upper case letter:
txt = "Hello, And Welcome To My World!"
x = txt.istitle()
print(x)
The istitle() method returns True if all words in a text start with an upper case letter, AND the rest of each word is lower case letters, otherwise False.
Symbols and numbers are ignored.
string.istitle()
No parameters.
Check if each word starts with an upper case letter:
a = "HELLO, AND WELCOME TO MY WORLD"
b = "Hello"
c = "22 Names"
d = "This Is %'!?"
print(a.istitle())
print(b.istitle())
print(c.istitle())
print(d.istitle())
Check if all the characters in the text are in upper case:
txt = "THIS IS NOW!"
x = txt.isupper()
print(x)
The isupper() method returns True if all the characters are in upper case, otherwise False.
Numbers, symbols and spaces are not checked, only alphabet characters.
string.isupper()
No parameters.
Check if all the characters in the texts are in upper case:
a = "Hello World!"
b = "hello 123"
c = "MY NAME IS PETER"
print(a.isupper())
print(b.isupper())
print(c.isupper())
Join all items in a tuple into a string, using a hash character as separator:
myTuple = ("John", "Peter", "Vicky")
x = "#".join(myTuple)
print(x)
The join() method takes all items in an iterable and joins them into one string.
A string must be specified as the separator.
string.join(iterable)
Parameter | Description |
---|---|
iterable | Required. Any iterable object where all the returned values are strings |
Join all items in a dictionary into a string, using the word "TEST" as separator (when a dictionary is used as the iterable, the keys are joined, not the values):
myDict = {"name": "John", "country": "Norway"}
mySeparator = "TEST"
x = mySeparator.join(myDict)
print(x)
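Because every item must be a string, joining numbers directly fails. A sketch of one common workaround, and of joining the values of a dictionary instead of its keys (names are illustrative):

```python
numbers = [1, 2, 3]

# Joining non-string items raises a TypeError
try:
    "-".join(numbers)
    failed = False
except TypeError:
    failed = True

# Convert each item to a string first
joined = "-".join(str(n) for n in numbers)
print(joined)

my_dict = {"name": "John", "country": "Norway"}
keys_joined = ", ".join(my_dict)             # iterating a dictionary yields its keys
values_joined = ", ".join(my_dict.values())  # ask for the values explicitly
print(keys_joined)
print(values_joined)
```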
Return a 20 characters long, left justified version of the word "banana":
txt = "banana"
x = txt.ljust(20)
print(x, "is my favorite fruit.")
Note: In the result, there are actually 14 whitespaces to the right of the word banana.
The ljust() method will left align the string, using a specified character (space is default) as the fill character.
string.ljust(length, character)
Parameter | Description |
---|---|
length | Required. The length of the returned string |
character | Optional. A character to fill the missing space (to the right of the string). Default is " " (space). |
Using the letter "O" as the padding character:
txt = "banana"
x = txt.ljust(20, "O")
print(x)
Lower case the string:
txt = "Hello my FRIENDS"
x = txt.lower()
print(x)
The lower() method returns a string where all characters are lower case.
Symbols and Numbers are ignored.
string.lower()
No parameters
Remove spaces to the left of the string:
txt = " banana "
x = txt.lstrip()
print("of all fruits", x, "is my favorite")
The lstrip() method removes any leading characters (whitespace is removed by default).
string.lstrip(characters)
Parameter | Description |
---|---|
characters | Optional. A set of characters to remove as leading characters |
Remove the leading characters:
txt = ",,,,,ssaaww.....banana"
x = txt.lstrip(",.asw")
print(x)
Create a mapping table, and use it in the translate() method to replace any "S" characters with a "P" character:
txt = "Hello Sam!"
mytable = txt.maketrans("S", "P")
print(txt.translate(mytable))
The maketrans() method returns a mapping table that can be used with the translate() method to replace specified characters.
string.maketrans(x, y, z)
Parameter | Description |
---|---|
x | Required. If only one parameter is specified, this has to be a dictionary describing how to perform the replace. If two or more parameters are specified, this parameter has to be a string specifying the characters you want to replace. |
y | Optional. A string with the same length as parameter x. Each character in the first parameter will be replaced with the corresponding character in this string. |
z | Optional. A string describing which characters to remove from the original string. |
Use a mapping table to replace many characters:
txt = "Hi Sam!"
x = "mSa"
y = "eJo"
mytable = txt.maketrans(x, y)
print(txt.translate(mytable))
The third parameter in the mapping table describes characters that you want to remove from the string:
txt = "Good night Sam!"
x = "mSa"
y = "eJo"
z = "odnght"
mytable = txt.maketrans(x, y, z)
print(txt.translate(mytable))
The maketrans() method itself returns a dictionary describing each replacement, in unicode:
txt = "Good night Sam!"
x = "mSa"
y = "eJo"
z = "odnght"
print(txt.maketrans(x, y, z))
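The parameter table mentions that a single dictionary can be passed instead of strings. A minimal sketch of that form, calling maketrans() on the str type directly; the sample mapping is illustrative:

```python
# Keys are single characters; a value of None means "remove this character"
table = str.maketrans({"S": "P", "!": None})
print(table)

result = "Hello Sam!".translate(table)
print(result)
```

The returned table uses character codes as keys, which is why "S" shows up as 83.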
Search for the word "bananas", and return a tuple with three elements:
1 - everything before the "match"
2 - the "match"
3 - everything after the "match"
txt = "I could eat bananas all day"
x = txt.partition("bananas")
print(x)
The partition() method searches for a specified string, and splits the string into a tuple containing three elements.
The first element contains the part before the specified string.
The second element contains the specified string.
The third element contains the part after the string.
Note: This method searches for the first occurrence of the specified string.
string.partition(value)
Parameter | Description |
---|---|
value | Required. The string to search for |
If the specified value is not found, the partition() method returns a tuple containing: 1 - the whole string, 2 - an empty string, 3 - an empty string:
txt = "I could eat bananas all day"
x = txt.partition("apples")
print(x)
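Since partition() always returns exactly three elements, the result can be unpacked directly. The sketch below splits a setting line at the first "=" only; the line format is a made-up example.

```python
line = "timeout = 30 = seconds"

# Only the first "=" is used as the split point
key, separator, value = line.partition("=")

key = key.strip()
value = value.strip()
print(key)
print(value)
```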
Replace the word "bananas":
txt = "I like bananas"
x = txt.replace("bananas", "apples")
print(x)
The replace() method replaces a specified phrase with another specified phrase.
Note: All occurrences of the specified phrase will be replaced, if nothing else is specified.
string.replace(oldvalue, newvalue, count)
Parameter | Description |
---|---|
oldvalue | Required. The string to search for |
newvalue | Required. The string to replace the old value with |
count | Optional. A number specifying how many occurrences of the old value you want to replace. Default is all occurrences |
Replace all occurrences of the word "one":
txt = "one one was a race horse, two two was one too."
x = txt.replace("one", "three")
print(x)
Replace the first two occurrences of the word "one":
txt = "one one was a race horse, two two was one too."
x = txt.replace("one", "three", 2)
print(x)
Where in the text is the last occurrence of the string "casa"?:
txt = "Mi casa, su casa."
x = txt.rfind("casa")
print(x)
The rfind() method finds the last occurrence of the specified value.
The rfind() method returns -1 if the value is not found.
The rfind() method is almost the same as the rindex() method. See example below.
string.rfind(value, start, end)
Parameter | Description |
---|---|
value | Required. The value to search for |
start | Optional. Where to start the search. Default is 0 |
end | Optional. Where to end the search. Default is to the end of the string |
Where in the text is the last occurrence of the letter "e"?:
txt = "Hello, welcome to my world."
x = txt.rfind("e")
print(x)
Where in the text is the last occurrence of the letter "e" when you only search between position 5 and 10?:
txt = "Hello, welcome to my world."
x = txt.rfind("e", 5, 10)
print(x)
If the value is not found, the rfind() method returns -1, but the rindex() method will raise an exception:
txt = "Hello, welcome to my world."
print(txt.rfind("q"))
print(txt.rindex("q"))
Where in the text is the last occurrence of the string "casa"?:
txt = "Mi casa, su casa."
x = txt.rindex("casa")
print(x)
The rindex() method finds the last occurrence of the specified value.
The rindex() method raises an exception if the value is not found.
The rindex() method is almost the same as the rfind() method. See example below.
string.rindex(value, start, end)
Parameter | Description |
---|---|
value | Required. The value to search for |
start | Optional. Where to start the search. Default is 0 |
end | Optional. Where to end the search. Default is to the end of the string |
Where in the text is the last occurrence of the letter "e"?:
txt = "Hello, welcome to my world."
x = txt.rindex("e")
print(x)
Where in the text is the last occurrence of the letter "e" when you only search between position 5 and 10?:
txt = "Hello, welcome to my world."
x = txt.rindex("e", 5, 10)
print(x)
If the value is not found, the rfind() method returns -1, but the rindex() method will raise an exception:
txt = "Hello, welcome to my world."
print(txt.rfind("q"))
print(txt.rindex("q"))
Return a 20 characters long, right justified version of the word "banana":
txt = "banana"
x = txt.rjust(20)
print(x, "is my favorite fruit.")
Note: In the result, there are actually 14 whitespaces to the left of the word banana.
The rjust() method will right align the string, using a specified character (space is default) as the fill character.
string.rjust(length, character)
Parameter | Description |
---|---|
length | Required. The length of the returned string |
character | Optional. A character to fill the missing space (to the left of the string). Default is " " (space). |
Using the letter "O" as the padding character:
txt = "banana"
x = txt.rjust(20, "O")
print(x)
Search for the last occurrence of the word "bananas", and return a tuple with three elements:
1 - everything before the "match"
2 - the "match"
3 - everything after the "match"
txt = "I could eat bananas all day, bananas are my favorite fruit"
x = txt.rpartition("bananas")
print(x)
The rpartition() method searches for the last occurrence of a specified string, and splits the string into a tuple containing three elements.
The first element contains the part before the specified string.
The second element contains the specified string.
The third element contains the part after the string.
string.rpartition(value)
Parameter | Description |
---|---|
value | Required. The string to search for |
If the specified value is not found, the rpartition() method returns a tuple containing: 1 - an empty string, 2 - an empty string, 3 - the whole string:
txt = "I could eat bananas all day, bananas are my favorite fruit"
x = txt.rpartition("apples")
print(x)
Split a string into a list, using comma, followed by a space (, ) as the separator:
txt = "apple, banana, cherry"
x = txt.rsplit(", ")
print(x)
The rsplit() method splits a string into a list, starting from the right.
If no maxsplit is specified, this method will return the same as the split() method.
Note: When maxsplit is specified, the list will contain at most the specified number of elements plus one.
string.rsplit(separator, maxsplit)
Parameter | Description |
---|---|
separator | Optional. Specifies the separator to use when splitting the string. By default any whitespace is a separator |
maxsplit | Optional. Specifies how many splits to do. Default value is -1, which is "all occurrences" |
Split the string into a list with maximum 2 items:
txt = "apple, banana, cherry"
# setting the maxsplit parameter to 1, will return a list with 2 elements!
x = txt.rsplit(", ", 1)
print(x)
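The difference between split() and rsplit() only shows when maxsplit is set. A short sketch comparing the two on the same string:

```python
txt = "apple, banana, cherry"

# split() uses up its one allowed split at the first separator from the left
from_left = txt.split(", ", 1)

# rsplit() uses up its one allowed split at the first separator from the right
from_right = txt.rsplit(", ", 1)

print(from_left)
print(from_right)
```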
Remove any white spaces at the end of the string:
txt = " banana "
x = txt.rstrip()
print("of all fruits", x, "is my favorite")
The rstrip() method removes any trailing characters (characters at the end of a string); whitespace is removed by default.
string.rstrip(characters)
Parameter | Description |
---|---|
characters | Optional. A set of characters to remove as trailing characters |
Remove the trailing characters if they are commas, periods, s, q, or w:
txt = "banana,,,,,ssqqqww....."
x = txt.rstrip(",.qsw")
print(x)
Split a string into a list where each word is a list item:
txt = "welcome to the jungle"
x = txt.split()
print(x)
The split() method splits a string into a list.
You can specify the separator, default separator is any whitespace.
Note: When maxsplit is specified, the list will contain at most the specified number of elements plus one.
string.split(separator, maxsplit)
Parameter | Description |
---|---|
separator | Optional. Specifies the separator to use when splitting the string. By default any whitespace is a separator |
maxsplit | Optional. Specifies how many splits to do. Default value is -1, which is "all occurrences" |
Split the string, using comma, followed by a space, as a separator:
txt = "hello, my name is Peter, I am 26 years old"
x = txt.split(", ")
print(x)
Use a hash character as a separator:
txt = "apple#banana#cherry#orange"
x = txt.split("#")
print(x)
Split the string into a list with max 2 items:
txt = "apple#banana#cherry#orange"
# setting the maxsplit parameter to 1, will return a list with 2 elements!
x = txt.split("#", 1)
print(x)
Split a string into a list where each line is a list item:
txt = "Thank you for the music\nWelcome to the jungle"
x = txt.splitlines()
print(x)
The splitlines() method splits a string into a list. The splitting is done at line breaks.
string.splitlines(keeplinebreaks)
Parameter | Description |
---|---|
keeplinebreaks | Optional. Specifies if the line breaks should be included (True), or not (False). Default value is False |
Split the string, but keep the line breaks:
txt = "Thank you for the music\nWelcome to the jungle"
x = txt.splitlines(True)
print(x)
Check if the string starts with "Hello":
txt = "Hello, welcome to my world."
x = txt.startswith("Hello")
print(x)
The startswith() method returns True if the string starts with the specified value, otherwise False.
string.startswith(value, start, end)
Parameter | Description |
---|---|
value | Required. The value to check if the string starts with |
start | Optional. An Integer specifying at which position to start the search |
end | Optional. An Integer specifying at which position to end the search |
Check if position 7 to 20 starts with the characters "wel":
txt = "Hello, welcome to my world."
x = txt.startswith("wel", 7, 20)
print(x)
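Both startswith() and endswith() also accept a tuple of values, and return True if any one of them matches. A sketch with illustrative strings:

```python
txt = "Hello, welcome to my world."
filename = "report.csv"

# True if the string starts with "Hi" OR "Hello"
is_greeting = txt.startswith(("Hi", "Hello"))

# True if the string ends with ".csv" OR ".tsv"
is_table_file = filename.endswith((".csv", ".tsv"))

print(is_greeting)
print(is_table_file)
```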
Remove spaces at the beginning and at the end of the string:
txt = " banana "
x = txt.strip()
print("of all fruits", x, "is my favorite")
The strip() method removes any leading (at the beginning) and trailing (at the end) characters; whitespace is removed by default.
string.strip(characters)
Parameter | Description |
---|---|
characters | Optional. A set of characters to remove as leading/trailing characters |
Remove the leading and trailing characters:
txt = ",,,,,rrttgg.....banana....rrr"
x = txt.strip(",.grt")
print(x)
Make the lower case letters upper case and the upper case letters lower case:
txt = "Hello My Name Is PETER"
x = txt.swapcase()
print(x)
The swapcase() method returns a string where all the upper case letters are lower case and vice versa.
string.swapcase()
No parameters.
Make the first letter in each word upper case:
txt = "Welcome to my world"
x = txt.title()
print(x)
The title() method returns a string where the first character in every word is upper case. Like a header, or a title.
If the word contains a number or a symbol, the first letter after that will be converted to upper case.
string.title()
No parameters.
Make the first letter in each word upper case:
txt = "Welcome to my 2nd world"
x = txt.title()
print(x)
Note that the first letter after a non-alphabet letter is converted into an upper case letter:
txt = "hello b2b2b2 and 3g3g3g"
x = txt.title()
print(x)
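The same rule applies to apostrophes, which can give surprising results for contractions and possessives. The sketch below shows the effect and one possible alternative, string.capwords() from the standard library, which only splits on whitespace:

```python
import string

txt = "they're bill's friends"

# title() treats the apostrophe as a word boundary
titled = txt.title()
print(titled)

# capwords() capitalizes only the first letter after whitespace
capitalized = string.capwords(txt)
print(capitalized)
```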
Replace any "S" characters with a "P" character:
#use a dictionary with ascii codes to replace 83 (S) with 80 (P):
mydict = {83: 80}
txt = "Hello Sam!"
print(txt.translate(mydict))
The translate() method returns a string where some specified characters are replaced with the character described in a dictionary, or in a mapping table.
Use the maketrans() method to create a mapping table.
If a character is not specified in the dictionary/table, the character will not be replaced.
If you use a dictionary, you must use character codes (the integers returned by ord()) instead of characters as the keys.
string.translate(table)
Parameter | Description |
---|---|
table | Required. Either a dictionary, or a mapping table describing how to perform the replace |
Use a mapping table to replace "S" with "P":
txt = "Hello Sam!"
mytable = txt.maketrans("S", "P")
print(txt.translate(mytable))
Use a mapping table to replace many characters:
txt = "Hi Sam!"
x = "mSa"
y = "eJo"
mytable = txt.maketrans(x, y)
print(txt.translate(mytable))
The third parameter in the mapping table describes characters that you want to remove from the string:
txt = "Good night Sam!"
x = "mSa"
y = "eJo"
z = "odnght"
mytable = txt.maketrans(x, y, z)
print(txt.translate(mytable))
The same example as above, but using a dictionary instead of a mapping table:
txt = "Good night Sam!"
mydict = {109: 101, 83: 74, 97: 111, 111: None, 100: None, 110: None, 103: None, 104: None, 116: None}
print(txt.translate(mydict))
Upper case the string:
txt = "Hello my friends"
x = txt.upper()
print(x)
The upper() method returns a string where all characters are in upper case.
Symbols and Numbers are ignored.
string.upper()
No parameters
Fill the string with zeros until it is 10 characters long:
txt = "50"
x = txt.zfill(10)
print(x)
The zfill() method adds zeros (0) at the beginning of the string, until it reaches the specified length.
If the value of the len parameter is less than the length of the string, no filling is done.
string.zfill(len)
Parameter | Description |
---|---|
len | Required. A number specifying the desired length of the string |
Fill the strings with zeros until they are 10 characters long:
a = "hello"
b = "welcome to the jungle"
c = "10.000"
print(a.zfill(10))
print(b.zfill(10))
print(c.zfill(10))
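If the string starts with a plus or minus sign, zfill() puts the zeros after the sign rather than before it. A sketch comparing this with rjust(), using illustrative values:

```python
negative = "-42".zfill(6)
positive = "+7".zfill(4)

# rjust() knows nothing about signs and simply pads on the left
padded = "-42".rjust(6, "0")

print(negative)
print(positive)
print(padded)
```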