Python File Handling: Create, Open, Append, Read, Write

In this Python tutorial for beginners, I am going to show you how to create a text file and write to it using Python.

To open a file in Python we use the open() function. open() returns a file object and is most commonly called with two arguments: open(filename, mode). Before we can read from or write to a file, we must tell Python which file we are going to work with and what we will be doing with it; this is what open() does. The file object it returns is a "file handle", a variable used to perform operations on the file, kind of like "File - Open" in a word processor.

name = open("filename") - opens the given file for reading and returns a file object
name.read() - the file's entire contents as a string
name.readline() - the next line from the file as a string
name.readlines() - the file's contents as a list of lines

The second argument to open() is called the "access mode". When it is omitted, the mode defaults to "r".
r is for reading. If the file does not exist, an error is raised.
w is for writing. If the file exists, its contents are overwritten. If the file does not exist, it will be created.
a is for appending. If the file exists, new data is appended to the end of the file. If the file does not exist, it will be created.
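As a minimal sketch of the three modes described above, the following writes, appends to, and reads back a small text file. The file name demo.txt and the temporary directory are assumptions made only for illustration, and the with statement is used so that each file is closed automatically.

```python
import os
import tempfile

# Hypothetical file location, chosen only for this example.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# "w" creates the file (or overwrites it if it already exists).
with open(path, "w") as f:
    f.write("first line\n")

# "a" appends to the end of the existing file.
with open(path, "a") as f:
    f.write("second line\n")

# "r" opens the file for reading; a missing file raises FileNotFoundError.
with open(path, "r") as f:
    lines = f.readlines()

print(lines)
```

Opening the same path again with "w" would discard both lines, which is the main practical difference between "w" and "a".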

#python

Easter Deckow

1655630160

PyTumblr: A Python Tumblr API v2 Client

PyTumblr

Installation

Install via pip:

$ pip install pytumblr

Install from source:

$ git clone https://github.com/tumblr/pytumblr.git
$ cd pytumblr
$ python setup.py install

Usage

Create a client

A pytumblr.TumblrRestClient is the object you'll make all of your calls to the Tumblr API through. Creating one is this easy:

import pytumblr

client = pytumblr.TumblrRestClient(
    '<consumer_key>',
    '<consumer_secret>',
    '<oauth_token>',
    '<oauth_secret>',
)

client.info() # Grabs the current user information

A few easy ways to get your credentials are:

  1. The built-in interactive_console.py tool (if you already have a consumer key & secret)
  2. The Tumblr API console at https://api.tumblr.com/console
  3. Get sample login code at https://api.tumblr.com/console/calls/user/info

Supported Methods

User Methods

client.info() # get information about the authenticating user
client.dashboard() # get the dashboard for the authenticating user
client.likes() # get the likes for the authenticating user
client.following() # get the blogs followed by the authenticating user

client.follow('codingjester.tumblr.com') # follow a blog
client.unfollow('codingjester.tumblr.com') # unfollow a blog

client.like(id, reblogkey) # like a post
client.unlike(id, reblogkey) # unlike a post

Blog Methods

client.blog_info(blogName) # get information about a blog
client.posts(blogName, **params) # get posts for a blog
client.avatar(blogName) # get the avatar for a blog
client.blog_likes(blogName) # get the likes on a blog
client.followers(blogName) # get the followers of a blog
client.blog_following(blogName) # get the publicly exposed blogs that [blogName] follows
client.queue(blogName) # get the queue for a given blog
client.submission(blogName) # get the submissions for a given blog

Post Methods

Creating posts

PyTumblr lets you create all of the various post types that Tumblr supports. A few default options can be used with any post type.

The default options are described below.

  • state - a string, the state of the post. Supported values are published, draft, queue, private
  • tags - a list, a list of strings that you want tagged on the post. eg: ["testing", "magic", "1"]
  • tweet - a string, the string of the customized tweet you want. eg: "Man I love my mega awesome post!"
  • date - a string, the customized GMT date that you want
  • format - a string, the format that your post is in. Supported values are html or markdown
  • slug - a string, the slug for the url of the post you want

We'll show examples of these default options throughout while showcasing the specific post types.

Creating a photo post

Creating a photo post supports a bunch of different options plus the described default options:

  • caption - a string, the user supplied caption
  • link - a string, the "click-through" url for the photo
  • source - a string, the url for the photo you want to use (use this or the data parameter)
  • data - a list or string, a list of filepaths or a single file path for multipart file upload

#Creates a photo post using a source URL
client.create_photo(blogName, state="published", tags=["testing", "ok"],
                    source="https://68.media.tumblr.com/b965fbb2e501610a29d80ffb6fb3e1ad/tumblr_n55vdeTse11rn1906o1_500.jpg")

#Creates a photo post using a local filepath
client.create_photo(blogName, state="queue", tags=["testing", "ok"],
                    tweet="Woah this is an incredible sweet post [URL]",
                    data="/Users/johnb/path/to/my/image.jpg")

#Creates a photoset post using several local filepaths
client.create_photo(blogName, state="draft", tags=["jb is cool"], format="markdown",
                    data=["/Users/johnb/path/to/my/image.jpg", "/Users/johnb/Pictures/kittens.jpg"],
                    caption="## Mega sweet kittens")

Creating a text post

Creating a text post supports the same options as default and just two other parameters:

  • title - a string, the optional title for the post. Supports markdown or html
  • body - a string, the body of the post. Supports markdown or html

#Creating a text post
client.create_text(blogName, state="published", slug="testing-text-posts", title="Testing", body="testing1 2 3 4")

Creating a quote post

Creating a quote post supports the same options as default and two other parameters:

  • quote - a string, the full text of the quote. Supports markdown or html
  • source - a string, the cited source. HTML supported

#Creating a quote post
client.create_quote(blogName, state="queue", quote="I am the Walrus", source="Ringo")

Creating a link post

Creating a link post supports the same options as default and three other parameters:

  • title - a string, the title of post that you want. Supports HTML entities.
  • url - a string, the url that you want to create a link post for.
  • description - a string, the description of the link that you have

#Create a link post
client.create_link(blogName, title="I like to search things, you should too.", url="https://duckduckgo.com",
                   description="Search is pretty cool when a duck does it.")

Creating a chat post

Creating a chat post supports the same options as default and two other parameters:

  • title - a string, the title of the chat post
  • conversation - a string, the text of the conversation/chat, with dialogue labels (no html)

#Create a chat post
chat = """John: Testing can be fun!
Renee: Testing is tedious and so are you.
John: Aw.
"""
client.create_chat(blogName, title="Renee just doesn't understand.", conversation=chat, tags=["renee", "testing"])

Creating an audio post

Creating an audio post allows for all default options and has three other parameters. The only thing to keep in mind while dealing with audio posts is to make sure that you use either the external_url parameter or data. You cannot use both at the same time.

  • caption - a string, the caption for your post
  • external_url - a string, the url of the site that hosts the audio file
  • data - a string, the filepath of the audio file you want to upload to Tumblr

#Creating an audio file
client.create_audio(blogName, caption="Rock out.", data="/Users/johnb/Music/my/new/sweet/album.mp3")

#lets use soundcloud!
client.create_audio(blogName, caption="Mega rock out.", external_url="https://soundcloud.com/skrillex/sets/recess")

Creating a video post

Creating a video post allows for all default options and has three other options. Like the other post types, it has some restrictions. You cannot use the embed and data parameters at the same time.

  • caption - a string, the caption for your post
  • embed - a string, the HTML embed code for the video
  • data - a string, the path of the file you want to upload

#Creating an upload from YouTube
client.create_video(blogName, caption="Jon Snow. Mega ridiculous sword.",
                    embed="http://www.youtube.com/watch?v=40pUYLacrj4")

#Creating a video post from local file
client.create_video(blogName, caption="testing", data="/Users/johnb/testing/ok/blah.mov")

Editing a post

Updating a post requires knowing the type of the post you're updating. You can supply any of the options given above when updating a post.

client.edit_post(blogName, id=post_id, type="text", title="Updated")
client.edit_post(blogName, id=post_id, type="photo", data="/Users/johnb/mega/awesome.jpg")

Reblogging a Post

Reblogging a post just requires knowing the post id and the reblog key, which is supplied in the JSON of any post object.

client.reblog(blogName, id=125356, reblog_key="reblog_key")

Deleting a post

Deleting just requires that you own the post and have the post id

client.delete_post(blogName, 123456) # Deletes your post :(

A note on tags: When passing tags, as params, please pass them as a list (not a comma-separated string):

client.create_text(blogName, tags=['hello', 'world'], ...)

Getting notes for a post

In order to get the notes for a post, you need to have the post id and the blog that it is on.

data = client.notes(blogName, id='123456')

The results include a timestamp you can use to make future calls.

data = client.notes(blogName, id='123456', before_timestamp=data["_links"]["next"]["query_params"]["before_timestamp"])
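The two calls above can be wrapped in a loop that keeps following the before_timestamp link. The sketch below is only an illustration: iter_notes and FakeClient are hypothetical names, FakeClient stands in for pytumblr.TumblrRestClient so the example runs offline, and the "notes" key of the response is an assumption about the response shape (only the _links path is taken from the call shown above).

```python
def iter_notes(client, blog_name, post_id, max_pages=5):
    """Yield notes page by page, following the before_timestamp link."""
    params = {}
    for _ in range(max_pages):
        data = client.notes(blog_name, id=post_id, **params)
        # "notes" is an assumed key holding the list of notes on this page.
        yield from data.get("notes", [])
        next_link = data.get("_links", {}).get("next")
        if not next_link:
            break  # no further pages
        params = {"before_timestamp": next_link["query_params"]["before_timestamp"]}


class FakeClient:
    """Hypothetical stand-in for the real client, returning canned pages."""

    def __init__(self):
        self.pages = {
            None: {
                "notes": [{"type": "like"}, {"type": "reblog"}],
                "_links": {"next": {"query_params": {"before_timestamp": "100"}}},
            },
            "100": {"notes": [{"type": "like"}]},
        }

    def notes(self, blog_name, id, before_timestamp=None):
        return self.pages[before_timestamp]


all_notes = list(iter_notes(FakeClient(), "example-blog", "123456"))
print(len(all_notes))
```

The max_pages limit is a design choice to keep the loop bounded; with a real client you would pass the client object created earlier instead of FakeClient.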

Tagged Methods

# get posts with a given tag
client.tagged(tag, **params)

Using the interactive console

This client comes with a nice interactive console to run you through the OAuth process, grab your tokens (and store them for future use).

You'll need pyyaml installed to run it, but then it's just:

$ python interactive_console.py

and away you go! Tokens are stored in ~/.tumblr and are also shared by other Tumblr API clients like the Ruby client.

Running tests

The tests (and coverage reports) are run with nose, like this:

python setup.py test

Author: tumblr
Source Code: https://github.com/tumblr/pytumblr
License: Apache-2.0 license

#python #api 

Kennith Blick

1625768100

Reading and Writing to Files in Python - Intermediate Python Tutorial #2

In this Python tutorial, we will learn how to work with text files in Python using the built-in open function. You will understand how to use the most important modes: read, write and append.
That's not all! We will discuss file parsing and touch on important string methods used for it, such as strip( ) and split( ). Finally, we wrap up with a file parsing exercise to practice the new concepts. After this video you will be confident working with text files, which is a very important skill to have as a programmer.
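To give a rough idea of how strip( ) and split( ) fit together when parsing, here is a small sketch. The "name,age" record layout and the sample lines are assumptions made for illustration and are not taken from the video.

```python
# Hypothetical file contents: one "name,age" record per line.
raw_lines = ["Alice,30\n", "  Bob,25  \n", "\n"]

records = []
for line in raw_lines:
    line = line.strip()               # remove surrounding whitespace and the newline
    if not line:                      # skip blank lines
        continue
    name, age = line.split(",")       # split the line into its two fields
    records.append((name, int(age)))  # convert the age from str to int

print(records)
```

The same loop works unchanged when raw_lines is replaced by a file object, because iterating over an open file also yields one line at a time.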

Playlist: Intermediate Python Tutorials | Video #2
Access the codes here: https://github.com/rscorrea1/youtube.git

Timestamp:
00:00 - Summary of the video
00:17 - Types of files
00:43 - How to open a file
01:17 - File modes
02:15 - How to read data from a file
03:00 - with statement
04:10 - readlines( ) method
05:05 - String: strip( ) method
06:22 - How to iterate over a file line by line
08:47 - How to write data to a file
11:43 - How to append data to a file
12:37 - Exercise: Parsing a text file
16:14 - Converting data types
17:00 - Next video announcement

Thumbnail:
Photo by Mario Ho on Unsplash

#reading #writing #python #intermediate python tutorial #reading and writing to files in python

Tamale Moses

1669003576

Exploring Mutable and Immutable in Python

In this Python article, let's learn about Mutable and Immutable in Python. 

Mutable and Immutable in Python

Mutable is a fancy way of saying that the internal state of an object can be changed/mutated. So, the simplest definition is: An object whose internal state can be changed is mutable. On the other hand, an immutable object doesn't allow any change once it has been created.

Both of these states are integral to Python data structures.

Mutable Definition

Mutable means that something is changeable or has the ability to change. In Python, 'mutable' is the ability of objects to change their values. These are often the objects that store a collection of data.

Immutable Definition

Immutable means that no change is possible over time. In Python, if the value of an object cannot be changed over time, then it is known as immutable. Once created, the value of these objects is permanent.

List of Mutable and Immutable objects

Objects that are mutable include:

  • Lists
  • Sets
  • Dictionaries
  • Instances of user-defined classes (mutable by default; it depends on how the user defines the class)

Objects that are immutable include:

  • Numbers (Integer, Float, Complex and Boolean, as well as Fraction and Decimal from the standard library)
  • Strings
  • Tuples
  • Frozen Sets
  • Instances of user-defined classes written to prevent changes (it depends on how the user defines the class)

Mutability is a property of an object's type and is independent of Python being a dynamically typed language. Though Mutable and Immutable in Python is a very basic concept, it can at times be a little confusing due to the non-transitive nature of immutability.

Objects in Python

In Python, everything is treated as an object. Every object has these three attributes:

  • Identity – This refers to the object's unique identifier, returned by id(). In CPython it is the address of the object in the computer's memory.
  • Type – This refers to the kind of object that is created. For example- integer, list, string etc.
  • Value – This refers to the value stored by the object. For example – List=[1,2,3] would hold the numbers 1, 2 and 3.

While the ID and Type of an object cannot be changed once it is created, the value can be changed for mutable objects.
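The three attributes can be inspected directly with the built-in id() and type() functions. The short sketch below uses a hypothetical list called numbers to show that changing the value of a mutable object leaves its identity and type untouched.

```python
numbers = [1, 2, 3]

identity_before = id(numbers)   # identity: stays fixed for the object's lifetime
kind_before = type(numbers)     # type: list

numbers.append(4)               # change the value in place

print(numbers)
print(id(numbers) == identity_before)
print(type(numbers) is kind_before)
```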


Mutable Objects in Python

I believe, rather than diving deep into the theory aspects of mutable and immutable in Python, a simple code would be the best way to depict what it means in Python. Hence, let us discuss the below code step-by-step:

#Creating a list which contains name of Indian cities  

cities = ['Delhi', 'Mumbai', 'Kolkata']

# Printing the elements from the list cities, separated by a comma & space

for city in cities:
    print(city, end=', ')

Output [1]: Delhi, Mumbai, Kolkata, 

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(cities)))

Output [2]: 0x1691d7de8c8

#Adding a new city to the list cities

cities.append('Chennai')

#Printing the elements from the list cities, separated by a comma & space 

for city in cities:
    print(city, end=', ')

Output [3]: Delhi, Mumbai, Kolkata, Chennai, 

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(cities)))

Output [4]: 0x1691d7de8c8

The above example shows us that we were able to change the internal state of the object ‘cities’ by adding one more city ‘Chennai’ to it, yet, the memory address of the object did not change. This confirms that we did not create a new object, rather, the same object was changed or mutated. Hence, we can say that the object which is a type of list with reference variable name ‘cities’ is a MUTABLE OBJECT.

Let us now discuss the term IMMUTABLE. Considering that we understood what mutable stands for, it is obvious that the definition of immutable will have ‘NOT’ included in it. Here is the simplest definition of immutable– An object whose internal state can NOT be changed is IMMUTABLE.

Again, if you pay attention to the error messages that Python raises, you will be able to identify the immutable objects in Python. For instance, consider the code below and its associated error message, raised while trying to change the value of a tuple at index 0.

#Creating a Tuple with variable name ‘foo’

foo = (1, 2)

#Changing the index[0] value from 1 to 3

foo[0] = 3
	
TypeError: 'tuple' object does not support item assignment 

Immutable Objects in Python

Once again, a simple code would be the best way to depict what immutable stands for. Hence, let us discuss the below code step-by-step:

#Creating a Tuple which contains English name of weekdays

weekdays = 'Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'

# Printing the elements of tuple weekdays

print(weekdays)

Output [1]:  ('Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday')

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(weekdays)))

Output [2]: 0x1691cc35090

#Tuples are immutable, so you cannot add new elements. Instead, we merge tuples with the + operator to add a new imaginary day to the tuple 'weekdays'

weekdays += 'Pythonday',

#Printing the elements of tuple weekdays

print(weekdays)

Output [3]: ('Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Pythonday')

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(weekdays)))

Output [4]: 0x1691cc8ad68

The above example shows that we were able to keep using the same variable name, which originally referenced a tuple with seven elements in it. However, the ID or the memory location of the old & new tuple is not the same. We were not able to change the internal state of the object 'weekdays'. The Python interpreter created a new object in memory, and the variable name 'weekdays' started referencing the new object with eight elements in it. Hence, we can say that the object which is a type of tuple with reference variable name 'weekdays' is an IMMUTABLE OBJECT.


Where can you use mutable and immutable objects:

Mutable objects can be used where you want to allow for any updates. For example, you have a list of employee names in your organizations, and that needs to be updated every time a new member is hired. You can create a mutable list, and it can be updated easily.

Immutability offers a lot of useful applications to different sensitive tasks we do in a network centred environment where we allow for parallel processing. By creating immutable objects, you seal the values and ensure that no threads can invoke overwrite/update to your data. This is also useful in situations where you would like to write a piece of code that cannot be modified. For example, a debug code that attempts to find the value of an immutable object.

Watch outs:  Non transitive nature of Immutability:

OK! Now we do understand what mutable & immutable objects in Python are. Let’s go ahead and discuss the combination of these two and explore the possibilities. Let’s discuss, as to how will it behave if you have an immutable object which contains the mutable object(s)? Or vice versa? Let us again use a code to understand this behaviour–

#creating a tuple (immutable object) which contains 2 lists(mutable) as its elements

#The elements (lists) contains the name, age & gender 

person = (['Ayaan', 5, 'Male'], ['Aaradhya', 8, 'Female'])

#printing the tuple

print(person)

Output [1]: (['Ayaan', 5, 'Male'], ['Aaradhya', 8, 'Female'])

#printing the location of the object created in the memory address in hexadecimal format

print(hex(id(person)))

Output [2]: 0x1691ef47f88

#Changing the age for the 1st element. Selecting 1st element of tuple by using indexing [0] then 2nd element of the list by using indexing [1] and assigning a new value for age as 4

person[0][1] = 4

#printing the updated tuple

print(person)

Output [3]: (['Ayaan', 4, 'Male'], ['Aaradhya', 8, 'Female'])

#printing the location of the object created in the memory address in hexadecimal format

print(hex(id(person)))

Output [4]: 0x1691ef47f88

In the above code, you can see that the object 'person' is immutable since it is a type of tuple. However, it has two lists as its elements, and we can change the state of lists (lists being mutable). So, here we did not change the object reference inside the Tuple, but the referenced object was mutated.


Same way, let’s explore how it will behave if you have a mutable object which contains an immutable object? Let us again use a code to understand the behaviour–

#creating a list (mutable object) which contains tuples(immutable) as its elements

list1 = [(1, 2, 3), (4, 5, 6)]

#printing the list

print(list1)

Output [1]: [(1, 2, 3), (4, 5, 6)]

#printing the location of the object created in the memory address in hexadecimal format

print(hex(id(list1)))

Output [2]: 0x1691d5b13c8	

#changing object reference at index 0

list1[0] = (7, 8, 9)

#printing the list

print(list1)

Output [3]: [(7, 8, 9), (4, 5, 6)]

#printing the location of the object created in the memory address in hexadecimal format

print(hex(id(list1)))

Output [4]: 0x1691d5b13c8

As an individual, it completely depends upon you and your requirements as to what kind of data structure you would like to create with a combination of mutable & immutable objects. I hope that this information will help you while deciding the type of object you would like to select going forward.

Before I end our discussion on IMMUTABILITY, allow me to add a 'CAVEAT' regarding Strings and Integers. There is an exception, and you may see some surprising results when checking object identity for immutable objects. For instance:

#creating an object of integer type with value 10 and reference variable name 'x' 

x = 10
 

#printing the value of ‘x’

print(x)

Output [1]: 10

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(x)))

Output [2]: 0x538fb560

#creating an object of integer type with value 10 and reference variable name ‘y’

y = 10

#printing the value of ‘y’

print(y)

Output [3]: 10

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(y)))

Output [4]: 0x538fb560

As per our discussion and understanding so far, we might have expected the memory addresses for x & y to differ, since each assignment appears to create a new instance of the immutable Integer class. However, as shown in the above code, both names have the same memory address. This happens because CPython caches small integers (from -5 to 256) and reuses one object for each value; since integers are immutable, sharing them in this way is safe. It is an implementation detail, so programs should not rely on it.
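A short sketch may help separate the two ideas at play here: whether two names share one object (identity) and whether an integer can be changed in place (it cannot). The variable names are only for illustration, and the result of the first identity check depends on the Python implementation.

```python
x = 10
y = 10

# Identity check: whether x and y share one object is an implementation
# detail (CPython reuses small integers), so do not rely on the result.
print(x is y)

# Equality is what programs should normally compare.
print(x == y)

# "Changing" an integer rebinds the name to a different object;
# the original object and the name x are unaffected.
z = x
z += 1
print(x, z)
print(z is x)
```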


Immutability of Tuple

Tuples are immutable and hence cannot be changed once they are created in Python. They support the same read-only sequence operations as strings, which are also immutable: the index operator selects an element from a tuple just as it does from a string, but neither type supports assigning to an index.

Exceptions in immutability

Like most rules, immutability in Python has exceptions too. Not all immutable objects are really immutable. This may raise some doubts in your mind, so let us take an example to understand it.

Consider a tuple 'tup':

tup = ('GreatLearning', [4, 3, 1, 2])

We see that the tuple has elements of different data types. The first element is a string, which is immutable. The second element is a list, which is mutable. The tuple itself is an immutable data type: it cannot change which objects it contains. But the list inside it can change its contents. So, the value of an immutable object cannot be changed, but the values of its constituent objects can.
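The behaviour described above can be checked with a few lines. This is a minimal sketch using the same 'tup' tuple; the variable names error and hashable are introduced only for the example.

```python
tup = ('GreatLearning', [4, 3, 1, 2])

# The list inside the tuple can still be changed in place.
tup[1].sort()
print(tup)

# Rebinding a slot of the tuple itself is not allowed.
try:
    tup[1] = [0]
except TypeError as exc:
    error = str(exc)
    print(error)

# A tuple that contains a mutable element cannot be hashed,
# so it cannot be used as a dictionary key or set element.
try:
    hash(tup)
    hashable = True
except TypeError:
    hashable = False
print(hashable)
```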

FAQs

1. Difference between mutable vs immutable in Python?

Mutable object vs. immutable object:

  • State: the state of a mutable object can be modified after it is created; the state of an immutable object can't be modified once it is created.
  • Thread safety: mutable objects are not thread safe; immutable objects are thread safe.
  • Class design: mutable classes are not final; it is important to make the class final before creating an immutable object.

2. What are the mutable and immutable data types in Python?

  • Some mutable data types in Python are:

list, dictionary, set, user-defined classes.

  • Some immutable data types are: 

int, float, decimal, bool, string, tuple, range.

3. Are lists mutable in Python?

Lists in Python are mutable data types as the elements of the list can be modified, individual elements can be replaced, and the order of elements can be changed even after the list has been created.
(Examples related to lists have been discussed earlier in this blog.)

4. Why are tuples called immutable types?

Tuple and list data structures are very similar, but one big difference between the data types is that lists are mutable, whereas tuples are immutable. The reason for the tuple’s immutability is that once the elements are added to the tuple and the tuple has been created; it remains unchanged.

A programmer would always prefer building a code that can be reused instead of making the whole data object again. Still, even though tuples are immutable, like lists, they can contain any Python object, including mutable objects.

5. Are sets mutable in Python?

A set is an iterable, unordered collection of elements that can be used to perform mathematical operations (like union, intersection, difference etc.). Every element in a set is unique and must be hashable, which in practice usually means immutable, i.e. no duplicate values are allowed, and the elements themselves can't be changed in place. However, we can add or remove items from the set as the set itself is mutable.
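The following sketch illustrates these points with a hypothetical set of colour names: the set can grow and shrink, it rejects an unhashable (mutable) element such as a list, and its immutable counterpart frozenset offers no method for adding elements.

```python
colors = {'red', 'green'}

# The set itself is mutable: items can be added and removed.
colors.add('blue')
colors.discard('red')

# Elements must be hashable, so a list is rejected.
try:
    colors.add(['a', 'list'])
    rejected = False
except TypeError:
    rejected = True

# frozenset is the immutable counterpart and has no add() method.
frozen = frozenset(colors)
has_add = hasattr(frozen, 'add')

print(sorted(colors), rejected, has_add)
```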

6. Are strings mutable in Python?

Strings are not mutable in Python. Strings are an immutable data type, which means that a string's value cannot be updated in place.



Original article source at: https://www.mygreatlearning.com

#python 

田辺 桃子

1680009431

How to Process CSV Files with Python

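In this tutorial, we will explore methods for reading, writing, and editing CSV (Comma-Separated Values) files using the Python standard library "csv".

Because CSV files are so widely used for storing data, these methods will prove crucial to programmers across different fields of work.

CSV files are not standardized. Regardless, there are some common structures seen in all sorts of CSV files. In most cases, the first line of a CSV file is reserved for the headers of the file's columns.

Each following line forms a row of data, with the fields arranged in the order matching the first row. As the name suggests, data values are usually separated by a comma, but other delimiters can be used.

Finally, some CSV files use double quotes around fields that contain special characters.

All of the examples in this tutorial are based on the following dummy data files: basic.csv, multiple_delimiters.csv and new_delimiter.csv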

Reading CSV (With or Without a Header)

First, we will examine the simplest case: reading an entire CSV file and printing every item that is read.

import csv
path = "data/basic.csv"
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile)
   for row in reader:
      for col in row:
         print(col,end=" ")
      print()

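Let's break this code down. The only library needed to work with CSV files is the "csv" Python library. After importing the library and setting the path of our CSV file, we use the open() function to open the file, which is then read line by line.

The parsing of the CSV file is handled by csv.reader(), which is discussed in detail later.

Each row of our CSV file is returned as a list of strings that can be handled in any way you please.

Often in practice we do not want to keep the column headers of the CSV file. It is standard to store the headers in the first line of the CSV.

Luckily, csv.reader() tracks how many lines have been read in its "line_num" attribute. Using this attribute, we can simply skip the first line of the CSV file.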

import csv
path = "data/basic.csv"
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile)
   for row in reader:
      # line_num is 1 right after the header row has been read, so skip it
      if reader.line_num != 1:
         for col in row:
            print(col,end=" ")
         print()
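A common alternative to checking line_num is to consume the header once with the built-in next() before looping. The sketch below is only an illustration: it reads from an in-memory string instead of data/basic.csv, and the sample contents are hypothetical.

```python
import csv
import io

# Hypothetical CSV content standing in for data/basic.csv.
sample = 'Name,Age\nAlice,30\nBob,25\n'

with io.StringIO(sample, newline='') as csvfile:
    reader = csv.reader(csvfile)
    header = next(reader)   # consume the header row once
    rows = list(reader)     # the remaining rows are data only

print(header)
print(rows)
```

Keeping the header in its own variable is handy later, for example when the same header has to be written to a new file.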

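CSV Reader Encoding

In the code above, we create an object called "reader", which is assigned the value returned by csv.reader().

reader = csv.reader(csvfile)

csv.reader() takes a few useful parameters. We will focus on just two: the "delimiter" parameter and the "quotechar" parameter. By default, these parameters take the values "," and '"'.

We will discuss the delimiter parameter in the next section.

The "quotechar" parameter is a single character used to quote fields that contain special characters. In our example, all of our header fields are wrapped in these quote characters.

This allows us to include a space character in the header "Favorite Color". Notice how the result changes if we change "quotechar" to the "|" symbol.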

import csv
path = "data/basic.csv"
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile, quotechar='|')
   for row in reader:
      # line_num starts at 1, so this condition is always true and every row is printed
      if reader.line_num != 0:
         for col in row:
            print(col,end="\t")
         print()

Changing "quotechar" from '"' to "|" causes double quotes to appear around the headers.
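The effect is easiest to see on a single line. In the sketch below the header line is hypothetical (it is not the actual content of basic.csv) and deliberately contains a comma inside a quoted field, which is the case where the quote character really matters.

```python
import csv
import io

# Hypothetical header line: two quoted fields, the second containing a comma.
line = '"Name","Favorite Color, if any"\n'

# Default quotechar ('"'): the quotes are removed and the inner comma is kept.
default_row = next(csv.reader(io.StringIO(line)))

# quotechar='|': the double quotes are ordinary characters, so the
# line is split at every comma and the quotes stay in the output.
pipe_row = next(csv.reader(io.StringIO(line), quotechar='|'))

print(default_row)
print(pipe_row)
```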

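Reading a Single Column

Reading a single column from a CSV is simple using our method above. Each row is a list containing the column elements.

So, rather than printing the entire row, we print only the desired column element from each row. For our example, we will print the second column.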

import csv
path = "data/basic.csv"
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile, delimiter=',')
   for row in reader:
      print(row[1])

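CSV Custom Delimiter

CSV files frequently use the "," symbol to separate data values. In fact, the comma is the default delimiter of csv.reader().

In practice, however, data files may use other symbols to separate data values. For example, the CSV file new_delimiter.csv uses ";" to separate its data values.

Reading this CSV file into Python is simple if we change the "delimiter" parameter of csv.reader().

reader = csv.reader(csvfile, delimiter=';')

Notice how we changed the delimiter argument from "," to ";". With this simple change, csv.reader() parses our CSV file as expected.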

import csv
path = "data/new_delimiter.csv"
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile, delimiter=';')
   for row in reader:
      # line_num starts at 1, so this condition is always true and every row is printed
      if reader.line_num != 0:
         for col in row:
            print(col,end="\t")
         print()

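CSV with Multiple Delimiters

The standard CSV package in Python cannot handle multiple delimiters. To deal with such cases, we will use the standard package "re".

The following example parses the CSV file "multiple_delimiters.csv". Looking at the structure of the data in "multiple_delimiters.csv", we see that the headers are separated by commas, while the remaining rows are separated by a comma, a vertical bar, and the text "Delimiter".

The core function that accomplishes the desired parsing is re.split(), which takes two strings as arguments: a highly structured string (a regular expression) denoting the delimiters, and the string to be split at those delimiters. First, let's look at the code.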

import re
path = "data/multiple_delimiters.csv"
with open(path, newline='') as csvfile:
   for row in csvfile:
      row = re.split('Delimiter|[|]|,|\n', row)
      for field in row:
         print(field, end='\t')
      print()

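The key part of this code is the first argument of re.split().

 'Delimiter|[|]|,|\n'

Each split point is separated by the symbol "|". Since this symbol is also a delimiter in our text, we must put brackets around it to escape the character.

Finally, we include the "\n" character as a delimiter so that the newline is not included in the last field of each row. To see why this matters, examine the result without "\n" as a split point.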

import re
path = "data/multiple_delimiters.csv"
with open(path, newline='') as csvfile:
   for row in csvfile:
      row = re.split('Delimiter|[|]|,', row)
      for field in row:
         print(field, end='\t')
      print()

Notice the extra spacing between each line of our output.
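The difference between the two patterns can also be seen on a single string. The line below is a hypothetical row that mixes the three delimiters; it is not taken from multiple_delimiters.csv.

```python
import re

# Hypothetical row mixing a comma, a vertical bar and the text "Delimiter".
line = "1,2|3Delimiter4\n"

# With "\n" as a split point the newline is removed from the last field,
# at the cost of one empty string at the end of the result.
with_newline = re.split('Delimiter|[|]|,|\n', line)

# Without it, the newline stays attached to the last field, which is
# what produces the extra blank lines when the fields are printed.
without_newline = re.split('Delimiter|[|]|,', line)

print(with_newline)
print(without_newline)
```

If the trailing empty string is unwanted, one option is to call line.rstrip('\n') before splitting instead of adding "\n" to the pattern.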

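Writing to a CSV File

Writing to a CSV file follows a structure similar to how we read the file. However, instead of printing the data, we will use the "writer" object from "csv" to write it.

First, we will do the simplest example possible: create a CSV file and write a header and some data to it.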

import csv
path = "data/write_to_file.csv"
with open(path, 'w', newline='') as csvfile:
   writer = csv.writer(csvfile)
   writer.writerow(['h1'] + ['h2'] + ['h3'])
   i = 0
   while i < 5:
      writer.writerow([i] + [i+1] + [i+2])
      i = i+1

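In this example, we instantiate the "writer" object with csv.writer(). After that, simply calling the writerow() method writes a list of values to the next line of our file, with the default delimiter "," placed between the field elements.

Editing the contents of an existing CSV file requires the following steps: read in the CSV file's data, edit the lists (update information, append new information, delete information), and then write the new data back to the CSV file.

For our example, we will edit the file created in the previous section, "write_to_file.csv".

Our goal is to double the values in the first row of data, delete the second row, and append a row of data at the end of the file.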

import csv
path = "data/write_to_file.csv"
#Read in Data
rows = []
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile)
   for row in reader:
      rows.append(row)
#Edit the Data
rows[1] = ['0','2','4']
del rows[2]
rows.append(['8','9','10'])
#Write the Data to File
with open(path, 'w', newline='') as csvfile:
   writer = csv.writer(csvfile)
   writer.writerows(rows)

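Using the techniques discussed in the previous sections, we read the data and stored the lists in a variable called "rows". Since all the elements are Python lists, we made the edits with standard list methods.

We open the file in the same way as before. The only difference when writing is that we use the writerows() method instead of the writerow() method.

Search and Replace in a CSV File

The process discussed in the previous section gives us a natural way to search and replace in a CSV file. In the example above, we read each row of the CSV file into a list of lists called "rows".

Since "rows" is a list object, we can use Python's list methods to edit it before writing the data back to the file. We used a few list methods in the example, but another useful method is the string method str.replace(), which takes two arguments: first the string to find, and then the string that replaces it.

For example, to replace every '3' with '10' we could do the following. The result is assigned back to rows[i], because assigning to the loop variable alone would leave "rows" unchanged.

for i, row in enumerate(rows):
   rows[i] = [field.replace('3','10') for field in row]

Likewise, if the data is imported as dictionary objects (discussed later), we can use Python's dictionary methods to edit the data before writing it back to the file.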

Python Dictionary to CSV (DictWriter)

Python's "csv" library also provides a convenient way to write dictionaries to a CSV file.

import csv
Dictionary1 = {'header1': '5', 'header2': '10', 'header3': '13'}
Dictionary2 = {'header1': '6', 'header2': '11', 'header3': '15'}
Dictionary3 = {'header1': '7', 'header2': '18', 'header3': '17'}
Dictionary4 = {'header1': '8', 'header2': '13', 'header3': '18'}
path = "data/write_to_file.csv"
with open(path, 'w', newline='') as csvfile:
   headers = ['header1', 'header2', 'header3']
   writer = csv.DictWriter(csvfile, fieldnames=headers)
   writer.writeheader()
   writer.writerow(Dictionary1)
   writer.writerow(Dictionary2)
   writer.writerow(Dictionary3)
   writer.writerow(Dictionary4)

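In this example, we have four dictionaries with the same keys. It is crucial that the keys match the header names desired in the CSV file.

Since we will be supplying our rows as dictionary objects, we instantiate our writer object with csv.DictWriter() and specify our headers.

After this, simply call the writerow() method to begin writing to our CSV file.

CSV to Python Dictionary (DictReader)

The CSV library also provides an intuitive csv.DictReader() class that reads the rows of a CSV file into dictionary objects. Here is a simple example.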

import csv
path = "data/basic.csv"
with open(path, newline='') as csvfile:
   reader = csv.DictReader(csvfile, delimiter=',')
   for row in reader:
      print(row)

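As we can see in the output, each row is stored as a dictionary object.

Split Large CSV File

If we wish to split a large CSV file into smaller CSV files, we use the following steps: input the file as a list of rows, write the first half of the rows to one file, and write the second half of the rows to another.

Here is a simple example where we turn “basic.csv” into “basic_1.csv” and “basic_2.csv”.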

import csv
path = "data/basic.csv"
#Read in Data
rows = []
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile)
   for row in reader:
      rows.append(row)
Number_of_Rows = len(rows)
#Write Half of the Data to a File
path = "data/basic_1.csv"
with open(path, 'w', newline='') as csvfile:
   writer = csv.writer(csvfile)
   writer.writerow(rows[0]) #Header
   for row in rows[1:int((Number_of_Rows+1)/2)]:
      writer.writerow(row)
#Write the Second Half of the Data to a File
path = "data/basic_2.csv"
with open(path, 'w', newline='') as csvfile:
   writer = csv.writer(csvfile)
   writer.writerow(rows[0]) #Header
   for row in rows[int((Number_of_Rows+1)/2):]:
      writer.writerow(row)

basic_1.csv:

basic_2.csv:

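In these examples, no new methods were used. Instead, we had two separate for loops that write the first and second halves of the data to the two CSV files.

Original article source at: https://likegeeks.com/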

#python #csv 

Desmond  Gerber

1680009180

How to Process CSV Files Using Python

Throughout this tutorial, we will explore methods for reading, writing, and editing CSV (Comma-Separated Values) files using the Python standard library “csv”.

Because CSV files are so widely used for storing and exchanging tabular data, these methods are useful to programmers across many fields of work.

CSV files are not standardized. Regardless, there are some common structures seen in all sorts of CSV files. In most cases, the first line of a CSV file is reserved for the headers of the columns of the files.

Each of the following lines forms a row of data, with the fields in the same order as the headers in the first row. As the name suggests, data values are usually separated by commas; however, other delimiters can be used.

Lastly, some CSV files wrap a field in double quotes when it contains special characters, such as the delimiter itself.
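To make the role of the double quotes concrete, here is a minimal sketch. It assumes a small in-memory sample (the names and values are made up for illustration) rather than one of the tutorial’s data files.

```python
import csv
import io

# Hypothetical sample: the second data field contains a comma,
# so it is wrapped in double quotes.
sample = 'Name,"Favorite Color"\nAda,"blue, green"\n'

# newline='' mirrors how the tutorial opens files for the csv module.
parsed_rows = list(csv.reader(io.StringIO(sample, newline='')))
print(parsed_rows)
```

The quoted field comes back as the single string “blue, green” instead of being split at the comma.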

All the examples used throughout this tutorial will be based on the following dummy data files: basic.csv, multiple_delimiters.csv, and new_delimiter.csv.

Read CSV (With Header or Without)

First, we will examine the simplest case: reading an entire CSV file and printing each item read in.

import csv
path = "data/basic.csv"
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile)
   for row in reader:
      for col in row:
         print(col,end=" ")
      print()

Let us break down this code. The only library needed to work with CSV files is the “csv” Python library. After importing the library and setting the path of our CSV file, we use the “open()” function to open the file so that it can be read line by line.

The parsing of the CSV file is handled by the “csv.reader()” method which is discussed in detail later.

Each row of our CSV file will be returned as a list of strings that can be handled in any way you please. Here is the output of the code above:

Frequently in practice, we do not wish to store the headers of the columns of the CSV file. It is standard to store the headers on the first line of the CSV.

Luckily, the reader object returned by “csv.reader()” tracks how many lines have been read in its “line_num” attribute. Using this attribute, we can simply skip the first line of the CSV file.

import csv
path = "data/basic.csv"
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile)
   for row in reader:
      # line_num is 1 once the first line has been read, so this skips the header
      if reader.line_num != 1:
         for col in row:
            print(col,end=" ")
         print()
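An alternative to checking “line_num” on every iteration is to consume the header once with the built-in “next()” before the loop starts. The sketch below assumes a small in-memory sample (made-up names and values) in place of “data/basic.csv”.

```python
import csv
import io

# Hypothetical stand-in for data/basic.csv so the sketch runs on its own.
sample = 'Name,Age\nAda,36\nAlan,41\n'

with io.StringIO(sample, newline='') as csvfile:
    reader = csv.reader(csvfile)
    # Consume the header row; the default None avoids an error on an empty file.
    header = next(reader, None)
    data_rows = [row for row in reader]

print(header)
print(data_rows)
```

This keeps the header available in its own variable while the loop only ever sees data rows.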

CSV Reader Encoding

In the code above, we create an object called “reader” which is assigned the value returned by “csv.reader()”.

reader = csv.reader(csvfile)

The “csv.reader()” method takes a few useful parameters. We will only focus on two: the “delimiter” parameter and the “quotechar” parameter. By default, these parameters take the values “,” (a comma) and ‘"’ (a double quote).

We will discuss the delimiter parameter in the next section.

The “quotechar” parameter is a single character that is used to enclose fields containing special characters. In our example, all our header fields have these quote characters around them.

This allows us to include a space character in the header “Favorite Color”. Notice how the result changes if we change our “quotechar” to the “|” symbol.

import csv
path = "data/basic.csv"
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile, quotechar='|')
   for row in reader:
      # Every row is printed, including the header
      for col in row:
         print(col,end="\t")
      print()

Changing the “quotechar” from ‘"’ to “|” resulted in the double quotes appearing around the headers.

Reading a Single Column

Reading a single column from a CSV is simple using our method above. Each row is a list containing that row’s column values.

Therefore, instead of printing out the entire row, we will only print out the desired column element from each row. For our example, we will print out the second column.

import csv
path = "data/basic.csv"
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile, delimiter=',')
   for row in reader:
      print(row[1])

CSV Custom Delimiter

CSV files frequently use the “,” symbol to distinguish between data values. In fact, the comma symbol is the default delimiter for the csv.reader() method.

In practice though, data files may use other symbols to distinguish between data values. For example, examine the contents of a CSV file (called new_delimiter.csv) which uses “;” to delimit between data values.

Reading in this CSV file to Python is simple if we alter the “delimiter” parameter of the “csv.reader()” method.

reader = csv.reader(csvfile, delimiter=';')

Notice how we changed the delimiter argument from “,” to “;”. The “csv.reader()” method will parse our CSV file as expected with this simple change.

import csv
path = "data/new_delimiter.csv"
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile, delimiter=';')
   for row in reader:
      # Every row is printed, including the header
      for col in row:
         print(col,end="\t")
      print()

CSV with Multiple Delimiters

Python’s standard “csv” package accepts only a single one-character delimiter, so it cannot handle multiple delimiters. In order to deal with such cases, we will use the standard package “re”.

The following example parses the CSV file “multiple_delimiters.csv”. Looking at the structure of the data in “multiple_delimiters.csv”, we see the headers are delimited with commas and the remaining rows are delimited with a comma, a vertical bar, and the text “Delimiter”.

The core function for accomplishing the desired parsing is the “re.split()” method, which takes two strings as arguments: a regular expression pattern describing the delimiters and the string to be split at those delimiters. First, let us see the code and output.

import re
path = "data/multiple_delimiters.csv"
with open(path, newline='') as csvfile:
   for row in csvfile:
      row = re.split('Delimiter|[|]|,|\n', row)
      for field in row:
         print(field, end='\t')
      print()

The key component of this code is the first parameter of “re.split()”.

 'Delimiter|[|]|,|\n'

In the pattern, the alternative split points are separated by the symbol “|”. Since this symbol is also a delimiter in our text, we must put square brackets around it, “[|]”, so that it is matched as a literal character.

Lastly, we put the “\n” character as a delimiter so that the newline will not be included in the final field of each row. To see the importance of this, examine the result without “\n” included as a split point.

import re
path = "data/multiple_delimiters.csv"
with open(path, newline='') as csvfile:
   for row in csvfile:
      row = re.split('Delimiter|[|]|,', row)
      for field in row:
         print(field, end='\t')
      print()

Notice the extra spacing between each row of our output.
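As a variation on the pattern above, the delimiters can be kept in a list and escaped with “re.escape()” instead of being bracketed by hand. This is only a sketch: the delimiter list follows the description of “multiple_delimiters.csv”, while the sample line is made up. It also strips the newline with “rstrip()” instead of listing “\n” as a split point.

```python
import re

# Delimiters as described for multiple_delimiters.csv.
delimiters = ['Delimiter', '|', ',']

# re.escape() backslash-escapes regex metacharacters such as "|",
# so every delimiter is matched literally.
pattern = '|'.join(re.escape(d) for d in delimiters)

line = 'aDelimiterb|c,d\n'  # hypothetical sample line
fields = re.split(pattern, line.rstrip('\n'))
print(fields)
```

Removing the newline before splitting also means no empty field is produced at the end of each row.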

Writing to a CSV File

Writing to a CSV file will follow a similar structure to how we read the file. However, instead of printing the data, we will use the “writer” object within “csv” to write the data.

First, we will do the simplest example possible: creating a CSV file and writing a header and some data in it.

import csv
path = "data/write_to_file.csv"
with open(path, 'w', newline='') as csvfile:
   writer = csv.writer(csvfile)
   writer.writerow(['h1'] + ['h2'] + ['h3'])
   i = 0
   while i < 5:
      writer.writerow([i] + [i+1] + [i+2])
      i = i+1

In this example, we instantiate the “writer” object with the “csv.writer()” method. After doing so, each call to the “writerow()” method writes a list of values onto the next row in our file, with the default delimiter “,” placed between the fields.
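To see exactly what the writer produces, the following sketch writes to an in-memory buffer instead of a file. The buffer and the sample values are assumptions for illustration; the “csv.writer()” calls are the same as above.

```python
import csv
import io

# In-memory text buffer standing in for the file on disk.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerow(['h1', 'h2', 'h3'])
# Non-string values are converted to strings; a field containing the
# delimiter is quoted automatically.
writer.writerow([0, 'a,b', 2])

written = buffer.getvalue()
print(repr(written))
```

Each row ends with “\r\n”, the writer’s default line terminator, which is why the tutorial opens files with newline='' so that no extra translation of line endings takes place.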

Editing the contents of an existing CSV file will require the following steps: read in the CSV file data, edit the lists (update information, append new information, delete information), and then write the new data back to the CSV file.

For our example, we will be editing the file created in the last section, “write_to_file.csv”.

Our goal will be to double the values in the first row of data, delete the second row, and append a row of data at the end of the file.

import csv
path = "data/write_to_file.csv"
#Read in Data
rows = []
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile)
   for row in reader:
      rows.append(row)
#Edit the Data
rows[1] = ['0','2','4']
del rows[2]
rows.append(['8','9','10'])
#Write the Data to File
with open(path, 'w', newline='') as csvfile:
   writer = csv.writer(csvfile)
   writer.writerows(rows)

Using the techniques discussed in the prior sections, we read the data and stored the lists in a variable called “rows”. Since all the elements were Python lists, we made the edits using standard list methods.

We opened the file in the same manner as before. The only difference when writing was our use of the “writerows()” method instead of the “writerow()” method.
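When the only change is adding rows at the end, the file does not need to be read and rewritten at all: opening it in append mode “a” is enough. The sketch below assumes a temporary file standing in for “data/write_to_file.csv”.

```python
import csv
import os
import tempfile

# Hypothetical temporary file standing in for data/write_to_file.csv.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'write_to_file.csv')

with open(path, 'w', newline='') as csvfile:
    csv.writer(csvfile).writerows([['h1', 'h2', 'h3'], [0, 1, 2]])

# Mode 'a' adds to the end of the file and leaves existing rows untouched.
with open(path, 'a', newline='') as csvfile:
    csv.writer(csvfile).writerow([8, 9, 10])

with open(path, newline='') as csvfile:
    appended_rows = list(csv.reader(csvfile))
print(appended_rows)
```

Updating or deleting existing rows still requires the read, edit, and write-back cycle shown above.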

Search & Replace CSV File

The process discussed in the last section gives us a natural way to search and replace within a CSV file. In the example above, we read each line of the CSV file into a list of lists called “rows”.

Since “rows” is a list object, we can use Python’s list methods to edit our CSV data before writing it back to a file. We used some list methods in the example. Because every field is a string, another useful method is “str.replace()”, which takes two arguments: first the string to be found, and then the string to replace the found string with.

For example, to replace all “3”s with “10”s, we could have done the following:

# Rebuild every row so that the replaced fields are stored back in rows
rows = [[field.replace('3','10') for field in row] for row in rows]

Similarly, if the data is imported as a dictionary object (as discussed later), we can use Python’s dictionary methods to edit the data before re-writing to the file.
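One caveat about the replace step: “str.replace()” works on substrings, so a field such as “13” is affected as well. The following sketch, using made-up rows, contrasts substring replacement with replacing only the fields that match exactly.

```python
# Hypothetical rows, as they would look after being read with csv.reader().
rows = [['h1', 'h2', 'h3'], ['1', '3', '13'], ['3', '4', '5']]

# Substring replacement: every "3" inside any field is replaced,
# including the one in the header "h3" and in the value "13".
substring_result = [[field.replace('3', '10') for field in row] for row in rows]

# Whole-field replacement: only fields that are exactly "3" change.
exact_result = [['10' if field == '3' else field for field in row] for row in rows]

print(substring_result)
print(exact_result)
```

Which variant is appropriate depends on the data; skipping the header row (rows[0]) before replacing is another option.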

Python Dictionary to CSV (DictWriter)

Python’s “csv” library also provides a convenient method for writing dictionaries into a CSV file.

import csv
Dictionary1 = {'header1': '5', 'header2': '10', 'header3': '13'}
Dictionary2 = {'header1': '6', 'header2': '11', 'header3': '15'}
Dictionary3 = {'header1': '7', 'header2': '18', 'header3': '17'}
Dictionary4 = {'header1': '8', 'header2': '13', 'header3': '18'}
path = "data/write_to_file.csv"
with open(path, 'w', newline='') as csvfile:
   headers = ['header1', 'header2', 'header3']
   writer = csv.DictWriter(csvfile, fieldnames=headers)
   writer.writeheader()
   writer.writerow(Dictionary1)
   writer.writerow(Dictionary2)
   writer.writerow(Dictionary3)
   writer.writerow(Dictionary4)

In this example, we have four dictionaries with the same keys. It is crucial that the keys match the header names you want in the CSV file.

Since we will be inputting our rows as dictionary objects, we instantiate our writer object with the “csv.DictWriter()” method and specify our headers.

After this is done, it is as simple as calling the “writerow()” method to begin writing to our CSV file.
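If the dictionaries are already collected in a list, “DictWriter” also offers “writerows()”, which writes them all in one call. This is a sketch that writes to an in-memory buffer (an assumption made so that it runs without a data directory) and reuses two of the dictionaries from the example.

```python
import csv
import io

records = [
    {'header1': '5', 'header2': '10', 'header3': '13'},
    {'header1': '6', 'header2': '11', 'header3': '15'},
]

buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=['header1', 'header2', 'header3'])
writer.writeheader()
writer.writerows(records)  # one call writes every dictionary in the list

dict_output = buffer.getvalue()
print(dict_output)
```

By default, a dictionary containing a key that is not listed in “fieldnames” raises a ValueError, which is why the keys and headers have to match.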

CSV to Python Dictionary (DictReader)

The CSV library also provides an intuitive “csv.DictReader()” method which inputs the rows from a CSV file into a dictionary object. Here is a simple example.

import csv
path = "data/basic.csv"
with open(path, newline='') as csvfile:
   reader = csv.DictReader(csvfile, delimiter=',')
   for row in reader:
      print(row)

As we can see in the output, each row was stored as a dictionary object.
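The practical benefit of dictionary rows is that fields can be looked up by header name instead of by position. Here is a minimal sketch, assuming a made-up in-memory sample in place of “data/basic.csv”.

```python
import csv
import io

# Hypothetical sample; the first line supplies the dictionary keys.
sample = 'Name,Age\nAda,36\nAlan,41\n'

reader = csv.DictReader(io.StringIO(sample, newline=''))
# Access a column by its header name.
names = [row['Name'] for row in reader]
# The header row is available through the fieldnames attribute.
fieldnames = reader.fieldnames

print(names)
print(fieldnames)
```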

Split Large CSV File

If we wish to split a large CSV file into smaller CSV files, we use the following steps: input the file as a list of rows, write the first half of the rows to one file and write the second half of the rows to another.

Here is a simple example where we turn “basic.csv” into “basic_1.csv” and “basic_2.csv”.

import csv
path = "data/basic.csv"
#Read in Data
rows = []
with open(path, newline='') as csvfile:
   reader = csv.reader(csvfile)
   for row in reader:
      rows.append(row)
Number_of_Rows = len(rows)
#Write Half of the Data to a File
path = "data/basic_1.csv"
with open(path, 'w', newline='') as csvfile:
   writer = csv.writer(csvfile)
   writer.writerow(rows[0]) #Header
   for row in rows[1:int((Number_of_Rows+1)/2)]:
      writer.writerow(row)
#Write the Second Half of the Data to a File
path = "data/basic_2.csv"
with open(path, 'w', newline='') as csvfile:
   writer = csv.writer(csvfile)
   writer.writerow(rows[0]) #Header
   for row in rows[int((Number_of_Rows+1)/2):]:
      writer.writerow(row)

basic_1.csv:

basic_2.csv:

In these examples, no new methods were used. Instead, we had two separate for loops that write the first and second halves of the data to the two CSV files.
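The same idea extends to more than two output files. The sketch below uses a hypothetical helper, “split_rows()”, that is not part of the “csv” library; it cuts the data rows into chunks of a chosen size and repeats the header at the top of each chunk. It assumes that “rows” contains at least the header row.

```python
def split_rows(rows, chunk_size):
    """Return a list of chunks, each starting with the header row."""
    header, data = rows[0], rows[1:]
    return [
        [header] + data[start:start + chunk_size]
        for start in range(0, len(data), chunk_size)
    ]

# Hypothetical rows, as they would look after being read with csv.reader().
rows = [['h1', 'h2'], ['1', '2'], ['3', '4'], ['5', '6']]
chunks = split_rows(rows, 2)
print(chunks)
```

Each chunk can then be written to its own file with “writerows()”, in the same way as “basic_1.csv” and “basic_2.csv” above.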

Original article source at: https://likegeeks.com/

#python #csv