David Nano

1567431672

File management with python

Most of us have had that one experience where we had a ton of disorganized files on our machines. It happens. One minute you're opening a large zip file; the next, its contents are scattered all over the directory, mixed in with your important files, leaving you to sort out manually what needs to go where. It's a real pain. To ease this process, we're going to delve into file management with Python the smart way.

Work smart, not hard.

Let's begin. We'll be using Python 3.6 or greater, since the clean-up code later on relies on f-strings.

Assuming you've got Python up and running already, we're going to take a walk with the os module and a few others that we will introduce along the way. All of them come with Python, so there's no need to install anything else to follow along.

Creating random files

Create a directory to work with. Call it ManageFiles. Inside this folder create another folder RandomFiles. Your directory structure should now look like this:

ManageFiles/
 |
 |_RandomFiles/

We're going to create random files to play with in the RandomFiles directory.

Create a file create_random_files.py inside the ManageFiles directory. You now have this:

ManageFiles/
 |
 |_ create_random_files.py
 |_RandomFiles/

Done? Now type in the following code; we'll get into its details in a moment.

import os
import random
from pathlib import Path

list_of_extensions = ['.rst', '.txt', '.md', '.docx', '.odt', '.html', '.ppt', '.doc']

# get into the RandomFiles directory
os.chdir('./RandomFiles')

for item in list_of_extensions:
    # create 20 random files for each file extension
    for num in range(20):
        # let the file name begin with a random number between 1 and 50
        file_name = random.randint(1, 50)
        file_to_create = str(file_name) + item
        # create an empty file with that name
        Path(file_to_create).touch()

As of Python 3.4, we've got pathlib, our little magic box. We also import Python's random module for creating random numbers. Hold on to that thought; we'll cover it when we get to the line that uses it.
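If you are curious what `Path(...).touch()` actually does, here is a minimal sketch. It runs in a throwaway temporary directory so it cannot clutter your real folders; the use of `tempfile` and the file name `23.txt` are illustrative assumptions, not part of the script above.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "23.txt"   # hypothetical file name
    existed_before = target.exists()

    target.touch()   # creates an empty file
    target.touch()   # touching an existing file is fine; it only updates its timestamp

    existed_after = target.exists()
    size = target.stat().st_size    # size in bytes of the new file

print(existed_before, existed_after, size)
```

So `touch()` gives us empty files, which is all we need to build a mess.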

First off, we create a list of file extensions from where we will get our random files. Feel free to add to it. 

Next up, we change to the RandomFiles directory, then comes our loop, so here goes.

We are simply saying: take each item in list_of_extensions and do the following to it. Take .txt, for instance. We enter a second loop, where we do something with .txt 20 times.

Remember our import of random? We use it to pick a random number between 1 and 50 for each file name. In short, this little loop saves us, the less creative lot (don't worry, I'm part of this crew), the time of naming random files. It simply creates a file such as 23.txt or 14.txt, always within our range of 1 to 50, twenty times over. If the same number comes up twice, touch() just touches the existing file again, so you may end up with slightly fewer than 20 files per extension. The point is just to create a mess large enough to be painful to sort by hand. The same process is repeated for the other extensions. Next? Run this in your terminal.

python create_random_files.py
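When you look at the result, you may count fewer than 20 files for some extensions. That is because `random.randint` can return the same number more than once. The sketch below shows the effect without touching the disk; the seed value `0` is an arbitrary assumption, used only so that reruns give the same names.

```python
import random

rng = random.Random(0)  # arbitrary seed, assumed here for repeatable runs

# Twenty draws, exactly like the inner loop of create_random_files.py
names = [str(rng.randint(1, 50)) + ".txt" for _ in range(20)]

# Duplicate names collapse into one file on disk
unique_names = set(names)

print(len(names), "names drawn,", len(unique_names), "distinct files")
```

If you need exactly 20 distinct files, `random.sample(range(1, 51), 20)` draws without repetition.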

Congratulations! We now have a mess of a directory. Now to clean it up.

In the same location as create_random_files.py, create a file clean_up.py and add the code below.

Method 1:

import os
import shutil
import glob

# get into the RandomFiles directory
os.chdir('./RandomFiles')

# get the list of files in the directory RandomFiles
files_to_group = []
for random_file in os.listdir('.'):
    files_to_group.append(random_file)

# get all the file extensions present
file_extensions = []
for our_file in files_to_group:
    file_extensions.append(os.path.splitext(our_file)[1])

print(set(file_extensions))

# create a set and assign it to a variable
file_types = set(file_extensions)

for type in file_types:
    # drop the leading dot so the folder is not hidden
    new_directory = type.replace(".", "")
    os.mkdir(new_directory)  # create directory with given name

    # move every file with this extension into the new directory
    for fname in glob.glob(f'*.{type[1:]}'):
        shutil.move(fname, new_directory)

For this, we import two new libraries: shutil and glob. shutil will help us move our files, while glob will help us find the files to classify. Just like before, this will all become clear as we get to the relevant lines.

First off, we get a list of all the files in the directory.

Here, we assume that we have no clue which files are in the directory. So, rather than listing the extensions by hand and handling each one with its own if statement, we want the program to look through the directory and work them out for us. What if the directory held dozens of different extensions, or piles of log files? Would you do this manually?

Once we get a list of all the files in the folder, we get into another loop, to get the file extensions of these files.

Notice how we use:

os.path.splitext(our_file)[1]

At this point, the our_file variable holds something like 5.docx. When we split it, we get this:

('5', '.docx')

We then take index [1] of that tuple, which gives us .docx, since 5 is at index [0].
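To get a feel for `os.path.splitext`, here is a small sketch that tries it on a few names. Only `5.docx` comes from the example above; the other file names are hypothetical ones chosen to show the edge cases.

```python
import os

examples = ["5.docx", "archive.tar.gz", "README", ".bashrc"]

results = {}
for name in examples:
    root, ext = os.path.splitext(name)
    results[name] = (root, ext)
    print(repr(name), "->", (root, ext))
```

Only the last extension is split off, and a name with no extension (or a dotfile such as `.bashrc`) gives an empty string as its extension. Our clean-up script does not handle that empty case, which is fine here because every file we generated has an extension.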

So we now have the list of all file extensions present in the folder, whether repeated or not.

To remove the repetition, we make a set. A set takes all the items from the list and keeps only the unique ones. In our case, if an extension such as .docx appeared in the list over and over, the set would hold it only once.

# create a set and assign it to a variable
file_types = set(file_extensions)
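Here is that deduplication in isolation, as a quick sketch. The sample list is made up for illustration.

```python
# A made-up list of extensions with repeats, like the one our loop builds
file_extensions = [".docx", ".txt", ".docx", ".md", ".txt", ".docx"]

file_types = set(file_extensions)

# Sets have no guaranteed order, so sort before printing for a stable display
print(sorted(file_types))  # ['.docx', '.md', '.txt']
```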

Remember, our set of file types still has the . at the start of every extension. If we created folders with exactly those names, we would end up with hidden folders (on Linux and macOS, names that start with a dot are hidden), and that is something we do not want.

So, as we loop over this set, we create a directory with the same extension name, only this time, we replace the . in the name with an empty string.

new_directory = type.replace(".", "")

Our directory would now be called 'docx'.
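The replacement really has to be an empty string. A quick sketch of the difference, using `.docx` as the sample extension:

```python
ext = ".docx"

with_empty = ext.replace(".", "")   # what we want
with_space = ext.replace(".", " ")  # leaves a leading space in the folder name
sliced = ext[1:]                    # slicing off the first character also works

print(repr(with_empty), repr(with_space), repr(sliced))
# 'docx' ' docx' 'docx'
```

Note that `replace` removes every dot in the string, while `[1:]` removes only the first character. For the extensions returned by `os.path.splitext` the two give the same result.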

We still need the .docx extension to move the files.

for fname in glob.glob(f'*.{type[1:]}'):

This simply says: take any file that ends with the .docx extension. Notice that there are no spaces anywhere inside f'*.{type[1:]}'.

The wildcard * means a file can be named anything, provided it ends in .docx. Since the pattern already supplies the period, we only need what comes after it in our extension string. That is why we use [1:], which means take everything after the first character, hence docx.
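To see the pattern at work, here is a small sketch that builds a few empty files in a temporary directory and globs for them. The file names and the use of `tempfile` are assumptions for illustration; the real script globs inside RandomFiles instead.

```python
import glob
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    for name in ["5.docx", "34.docx", "7.doc", "12.txt"]:
        Path(tmp, name).touch()

    ext = ".docx"
    pattern = os.path.join(tmp, f"*.{ext[1:]}")   # e.g. <tmp>/*.docx
    docx_matches = sorted(os.path.basename(p) for p in glob.glob(pattern))

    # The pattern must match the whole name, so *.doc does not pick up .docx files
    doc_matches = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(tmp, "*.doc"))
    )

print(docx_matches)  # ['34.docx', '5.docx']
print(doc_matches)   # ['7.doc']
```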

What next? Move any file with this extension into the directory named as so.

shutil.move(fname, new_directory)

In this way, each directory is created only once, on the pass of the loop that handles its extension. In short, we will not have one folder to store 5.docx and others to store 34.docx and so on. Once a directory has been made, every file with that extension moves there. That's it!
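One thing to watch for: `os.mkdir` raises `FileExistsError` if the folder is already there, so running the clean-up a second time would fail. A possible variation, not what the script above does, is `os.makedirs(..., exist_ok=True)`. The sketch below tries it together with `shutil.move` in a temporary directory; the file name is made up.

```python
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "5.docx")   # hypothetical file to move
    source.touch()

    new_directory = os.path.join(tmp, "docx")
    os.makedirs(new_directory, exist_ok=True)
    os.makedirs(new_directory, exist_ok=True)  # second call is harmless

    # When the destination is an existing directory, the file is moved into it
    shutil.move(str(source), new_directory)

    moved_ok = os.path.exists(os.path.join(new_directory, "5.docx"))
    source_gone = not source.exists()

print(moved_ok, source_gone)
```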

Method 2

Alternatively, you can use comprehensions and generator expressions. These are a compact way of building a list (or feeding a set) in a single line.

import os
import shutil
import glob

# get into the RandomFiles directory
os.chdir('./RandomFiles')

# take every file from the directory and add it to a list of all files
all_files = [x for x in os.listdir('.')]

# make a set of the extensions present in the directory
file_types = set(os.path.splitext(f)[1] for f in all_files)

for ftype in file_types:
    new_directory = ftype.replace(".", "")
    os.mkdir(new_directory)

    # move every file with this extension into the new directory
    for fname in glob.glob(f'*.{ftype[1:]}'):
        shutil.move(fname, new_directory)
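If the difference between the two one-liners in Method 2 is not obvious, this sketch puts them side by side: square brackets give a list comprehension, while round brackets give a generator expression. The file names are made up for illustration.

```python
import os
import types

names = ["5.docx", "14.txt", "23.txt"]

# List comprehension: builds the whole list in memory right away
as_list = [os.path.splitext(n)[1] for n in names]

# Generator expression: yields items lazily, one at a time, as they are consumed
as_gen = (os.path.splitext(n)[1] for n in names)
is_generator = isinstance(as_gen, types.GeneratorType)

# set() consumes the generator and keeps the unique extensions
file_types = set(as_gen)

print(as_list)             # ['.docx', '.txt', '.txt']
print(sorted(file_types))  # ['.docx', '.txt']
```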

Both methods will work. You've now got all your files sorted according to extension.

ManageFiles/
 |
 |_ create_random_files.py
 |_ clean_up.py
 |_RandomFiles/
     |_doc/
     |_docx/
     |_html/
     |_md/
     |_odt/
     |_ppt/
     |_rst/
     |_txt/
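For comparison, here is one way the same idea could be written with pathlib alone, wrapped in a function. Treat it as a sketch under a few assumptions rather than a replacement for the scripts above: the function name `organize_by_extension` is made up, it skips folders and files without an extension, and it tolerates folders that already exist. The demo runs in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def organize_by_extension(directory):
    """Move each file in `directory` into a subfolder named after its extension."""
    directory = Path(directory)
    moved = {}
    # Take a snapshot of the listing first, since we create folders while looping
    for entry in list(directory.iterdir()):
        if not entry.is_file() or not entry.suffix:
            continue  # skip folders and files without an extension
        folder_name = entry.suffix[1:]          # ".txt" -> "txt"
        target_dir = directory / folder_name
        target_dir.mkdir(exist_ok=True)         # no error if it already exists
        shutil.move(str(entry), str(target_dir / entry.name))
        moved.setdefault(folder_name, []).append(entry.name)
    return moved


with tempfile.TemporaryDirectory() as tmp:
    for name in ["1.txt", "2.txt", "3.md", "notes"]:
        Path(tmp, name).touch()

    result = organize_by_extension(tmp)
    txt_files = sorted(p.name for p in Path(tmp, "txt").iterdir())
    leftover = sorted(p.name for p in Path(tmp).iterdir())

print(txt_files)  # ['1.txt', '2.txt']
print(leftover)   # ['md', 'notes', 'txt']
```

Because the function takes the directory as an argument, there is no need for `os.chdir`, and it can safely be run more than once on the same folder.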

Whoosh! That was a lot. We did save some time, though. Any questions? Feel free to reach out. That's it for now. Stick around as we take it up a notch next week.

Originally published by Marvin  at dev.to
