Edward Jackson

1550111195

Proper way of reading in files from a directory using Python 2.6 in bash shell

I am trying to read in files for text processing. The idea is to run them through Hadoop's pseudo-distributed file system on my virtual machine, using map-reduce code I am writing. The interface is Ubuntu Linux, and I am running the Python 2.6 that comes with the installation. I need to use sys.stdin to read in the files and sys.stdout to pass data from the mapper to the reducer. Here is my test code for the mapper:

#!/usr/bin/env python

import sys
import string
import glob
import os

files = glob.glob(sys.stdin)
for file in files:
    with open(file) as infile:
        txt = infile.read()
        txt = txt.split()
    print(txt)

I’m not sure how glob works with sys.stdin, but this is not working. When I test it with a pipe:

[training@localhost data]$ cat test | ./mapper.py

I get this:

cat: test: Is a directory
Traceback (most recent call last):
  File "./mapper.py", line 8, in <module>
    files = glob.glob(sys.stdin)
  File "/usr/lib64/python2.6/glob.py", line 16, in glob
    return list(iglob(pathname))
  File "/usr/lib64/python2.6/glob.py", line 24, in iglob
    if not has_magic(pathname):
  File "/usr/lib64/python2.6/glob.py", line 78, in has_magic
    return magic_check.search(s) is not None
TypeError: expected string or buffer

For the moment I am just trying to read in three small .txt files in one directory.

Thanks!

#python #bash #hadoop

Valerio Tana

1550130008

I still do not fully understand what your expected output is (a list or plain text), but the following would work:

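#!/usr/bin/env python

import sys, glob

# Read the directory name from stdin and strip the trailing newline
dirname = sys.stdin.read().rstrip('\r\n')
# Match every entry inside that directory
paths = glob.glob(dirname + '/*')
for path in paths:
    with open(path) as infile:
        # Split the file contents into a list of whitespace-separated words
        txt = infile.read().split()
    print(txt)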

Then execute with:

echo "test" | ./mapper.py

I recommend passing the directory name as a command-line argument rather than via stdin as above.
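A minimal sketch of the command-line variant, assuming the directory name arrives as the first argument. The helper name `read_words` and the `*.txt` pattern are illustrative choices, not part of the original script:

```python
#!/usr/bin/env python
import glob
import os
import sys


def read_words(dirname, pattern='*.txt'):
    """Return {path: list of words} for files in dirname matching pattern.

    Both the function name and the default pattern are assumptions
    made for this example.
    """
    result = {}
    for path in sorted(glob.glob(os.path.join(dirname, pattern))):
        with open(path) as infile:
            result[path] = infile.read().split()
    return result


# Only act when exactly one argument (the directory name) was given
if __name__ == '__main__' and len(sys.argv) == 2:
    for path, words in sorted(read_words(sys.argv[1]).items()):
        print(words)
```

With this version the script would be run as `./mapper.py test`, with no pipe involved.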

If you want to tweak the format of the output, please let me know. Hope this helps.
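For the Hadoop case mentioned in the question, a streaming-style mapper normally receives the file contents themselves on stdin, line by line, rather than a directory name. A rough sketch under that assumption, emitting tab-separated key/value lines; the function name `map_lines` and the word-count logic are only illustrative:

```python
def map_lines(lines):
    """Yield one 'word<TAB>1' record per word found in the input lines."""
    for line in lines:
        for word in line.split():
            yield '%s\t%d' % (word, 1)


# In-memory stand-in for stdin so the example runs on its own;
# a real mapper would iterate over sys.stdin instead.
sample = ['the quick fox\n', 'the end\n']
for record in map_lines(sample):
    print(record)
```

In an actual mapper, replacing `sample` with `sys.stdin` lets it be tested locally with `cat test/*.txt | ./mapper.py`, which pipes the file contents instead of calling `cat` on the directory itself.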

Ray Patel

1619518440

Top 30 Python Tips and Tricks for Beginners

Welcome to my blog. In this article, you are going to learn the top 30 Python tips and tricks.

1) Swap two numbers.

2) Reverse a string in Python.

3) Create a single string from all the elements in a list.

4) Chain comparison operators.

5) Print the file path of imported modules.

6) Return multiple values from functions.

7) Find the most frequent value in a list.

8) Check the memory usage of an object.
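The tips listed above can be sketched in a few lines each. This is one possible way to write them, using only the standard library; the variable names and sample values are made up for illustration:

```python
import os
import sys

# 1) Swap two numbers with tuple unpacking
a, b = 1, 2
a, b = b, a

# 2) Reverse a string with a slice that steps backwards
reversed_text = 'python'[::-1]

# 3) Create a single string from all the elements in a list
sentence = ' '.join(['tips', 'and', 'tricks'])

# 4) Chain comparison operators
n = 5
in_range = 1 < n < 10

# 5) File path of an imported module
module_path = os.__file__


# 6) Return multiple values from a function (as a tuple)
def min_max(values):
    return min(values), max(values)


low, high = min_max([3, 1, 4])

# 7) Most frequent value in a list
values = [1, 2, 2, 3, 2]
most_frequent = max(set(values), key=values.count)

# 8) Memory usage of an object, in bytes (shallow size only)
size_in_bytes = sys.getsizeof(values)

print(a, b, reversed_text, sentence, in_range, low, high, most_frequent)
```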

#python #python hacks tricks #python learning tips #python programming tricks #python tips #python tips and tricks #python tips and tricks advanced #python tips and tricks for beginners #python tips tricks and techniques #python tutorial #tips and tricks in python #tips to learn python #top 30 python tips and tricks for beginners

Ray Patel

1619510796

Lambda, Map, Filter functions in Python

Welcome to my blog. In this article, we will learn about Python's lambda, map, and filter functions.

Lambda function in Python: a lambda is a one-line anonymous function. It can take any number of arguments but can have only one expression. The syntax is:

Syntax: x = lambda arguments : expression

Now I will show you some Python lambda function examples:
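A few small illustrations of the syntax above, combined with `map` and `filter`. The names and sample numbers are invented for this sketch:

```python
# A lambda with one argument and one with two arguments
square = lambda x: x * x
add = lambda x, y: x + y

numbers = [1, 2, 3, 4, 5]

# map applies the function to every element
squares = list(map(lambda x: x * x, numbers))

# filter keeps only the elements for which the function returns True
evens = list(filter(lambda x: x % 2 == 0, numbers))

print(square(4), add(2, 3), squares, evens)
```

Wrapping the results in `list(...)` keeps the example working on Python 3, where `map` and `filter` return iterators rather than lists.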

#python #anonymous function python #filter function in python #lambda #lambda python 3 #map python #python filter #python filter lambda #python lambda #python lambda examples #python map

I am Developer

1620616862

How to Delete Directories and Files in Linux using Command Line

In this tutorial, you will learn how to remove empty and non-empty directories in Linux using the command line, as well as how to remove files.

If you work with Linux, you will need to know the following:

  • how to remove an empty directory in Linux,
  • how to remove a non-empty directory,
  • how to remove a directory without confirmation,
  • how to remove files with and without confirmation.

This tutorial will show you how to use the rm, unlink, and rmdir commands to remove files and directories in Linux, with and without confirmation.

https://www.tutsmake.com/how-to-remove-directories-and-files-using-linux-command-line/

#how to delete directory in linux #how to remove non empty directory in linux #remove all files in a directory linux #linux delete all files in current directory #linux delete all files in a directory recursively #delete all files in a directory linux

Ray Patel

1623225360

How to Run bash scripts Using Python?

If you are using Linux, then you would definitely love the shell commands.

And if you are working with Python, then you may have tried to automate things. That’s a way to save time. You may also have some bash scripts to automate things.

Python is handier than bash for writing scripts, and Python scripts are easier to manage than bash scripts. Bash scripts become difficult to maintain as they grow.

But what if you already have bash scripts that you want to run using Python?

Is there any way to execute the bash commands and scripts in Python?

Yeah, Python has a built-in module called subprocess which is used to execute the commands and scripts inside Python scripts. Let’s see how to execute bash commands and scripts in Python scripts in detail.
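As a rough sketch of the idea, assuming Python 3.5 or later and a `bash` executable on the PATH, a command string can be handed to bash through `subprocess.run`. The helper name `run_bash` is hypothetical:

```python
import subprocess


def run_bash(command):
    """Run a command string with bash and return (exit_code, stdout_text)."""
    completed = subprocess.run(
        ['bash', '-c', command],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text instead of bytes
    )
    return completed.returncode, completed.stdout


code, output = run_bash('echo "hello from bash"')
print(code, output.strip())
```

An existing script file could be run the same way by passing its path in place of `-c` and the command string, for example `['bash', 'script.sh']`, where `script.sh` is a placeholder name.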

#development #python #how to run bash scripts using python #shell script from python #bash script #shell=