Read Multiple Text Files at the Same Time in Python
The Best Practice of Reading Text Files in Python
Combine multiple files into a single stream with richer metadata
Reading text files in Python is relatively easy compared with most other programming languages. Usually, we simply use the open() function in reading or writing mode and then loop over the text file line by line.
This is already good practice, and for a single file it can hardly be any easier. However, when we want to read content from multiple files, there is definitely a better way: the fileinput module that is built into Python. It combines the content of multiple files into one stream, which lets us process everything in a single for-loop, and it brings plenty of other benefits.
In this article, I'll demonstrate this module with examples.
0. Without the FileInput Module
Let's take a look at the "ordinary" way of reading multiple text files using the open() function. But before that, we need to create two sample files for demonstration purposes.
with open('my_file1.txt', mode='w') as f:
    f.write('This is line 1-1\n')
    f.write('This is line 1-2\n')

with open('my_file2.txt', mode='w') as f:
    f.write('This is line 2-1\n')
    f.write('This is line 2-2\n')
In the above code, we open a file with the mode w, which means "write". Then, we write two lines to the file. Please note that we need to add the newline character \n. Otherwise, the two sentences will be written on a single line.
After that, we should have two text files in the current working directory.
Now, let's say we want to read from both text files and print the content line by line. Of course, we can still do that using the open() function.
# Iterate through all files
for file in ['my_file1.txt', 'my_file2.txt']:
    with open(file, 'r') as f:
        for line in f:
            print(line)
Here we have to use two nested for-loops. The outer loop is for the files, while the inner one is for the lines within each file.
1. Using the FileInput Module
Well, nothing prevents us from using the open() function. However, the fileinput module provides us with a neater way of reading multiple text files into a single stream.
First of all, we need to import the module. It is part of the Python standard library, so we don't need to download anything.
import fileinput as fi
Then, we can use it for reading from the two files.
with fi.input(files=['my_file1.txt', 'my_file2.txt']) as f:
    for line in f:
        print(line)
Because the fileinput module is designed for reading from multiple files, we don't need to loop over the file names anymore. Instead, the input() function takes an iterable collection such as a list as its files parameter. Also, the great thing is that all the lines from both files are accessible in a single for-loop.
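As a small self-contained sketch of the single-stream idea, the snippet below gathers the lines of two files into one list. The file names a.txt and b.txt, the temporary directory, and the variable names are assumptions made for this illustration only. Each line keeps its trailing newline, which is why print(line) shows a blank line after every line; stripping it is optional.

```python
import fileinput
import os
import tempfile

# Hypothetical sample files, created in a temporary directory for this sketch.
tmp_dir = tempfile.mkdtemp()
paths = []
for name, lines in [('a.txt', ['line 1-1\n', 'line 1-2\n']),
                    ('b.txt', ['line 2-1\n', 'line 2-2\n'])]:
    path = os.path.join(tmp_dir, name)
    with open(path, mode='w') as f:
        f.writelines(lines)
    paths.append(path)

# One loop over both files; the files are read in the order given.
with fileinput.input(files=paths) as stream:
    # rstrip('\n') drops the newline that every line still carries.
    all_lines = [line.rstrip('\n') for line in stream]

print(all_lines)
```

The list ends up holding the lines of the first file followed by the lines of the second, with no nested loop needed.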
2. Use the FileInput Module with Glob
Sometimes, it may not be practical to maintain a list of file names that are all typed manually. It is quite common to read all the files from a directory. Also, we might be interested only in certain types of files.
In this case, we can use the glob module, another module from the Python standard library, together with the fileinput module.
We can do a simple experiment before that. The os module can help us to list all the files in the current working directory.
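The listing step is not reproduced here, so the following is only a guess at what it looked like: a minimal sketch using os.listdir(), which returns the names of the entries in a directory (the current working directory when called without an argument). The variable name entries is an assumption.

```python
import os

# List every entry (files and sub-directories) in the current working directory.
entries = os.listdir()
print(entries)
```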
The listing will usually show many files other than the two text files. Therefore, we want to filter the file names, because we want to read the text files only. We can use the glob module as follows.
from glob import glob

glob('*.txt')
Now, we can pass the result of the glob() function to the fileinput.input() function as the files parameter. Then, only these two text files will be read.
with fi.input(files=glob('*.txt')) as f:
    for line in f:
        print(line)
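Two details are worth handling when combining glob with fileinput. glob() does not promise any particular ordering of its matches, and an empty files list makes fileinput fall back to standard input. The sketch below, which assumes hypothetical files a.txt, b.txt and notes.md in a temporary directory, sorts the matches and guards against the empty case.

```python
import fileinput
import os
import tempfile
from glob import glob

# Hypothetical sample directory: two text files and one file to be ignored.
tmp_dir = tempfile.mkdtemp()
for name in ['b.txt', 'a.txt', 'notes.md']:
    with open(os.path.join(tmp_dir, name), mode='w') as f:
        f.write(f'content of {name}\n')

# Sort the matches so the files are always read in a predictable order.
txt_files = sorted(glob(os.path.join(tmp_dir, '*.txt')))

contents = []
# An empty list would make fileinput read from standard input, so check first.
if txt_files:
    with fileinput.input(files=txt_files) as stream:
        contents = [line.rstrip('\n') for line in stream]

print(contents)
```

Only the two .txt files are read, and a.txt always comes before b.txt regardless of the order in which the file system reports them.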
3. Get the Metadata of Files
You may ask: how can we know which file a given line comes from when we are reading from a stream that is actually combined from multiple files?
Indeed, with the open() function and a nested loop it seems very easy to get such information, because we can access the current file name from the outer loop. However, this is in fact even easier with the fileinput module.
with fi.input(files=glob('*.txt')) as f:
    for line in f:
        print(f'File Name: {f.filename()} | Line No: {f.lineno()} | {line}')
In the above code, we use filename() to access the name of the file that the current line comes from, and lineno() to access the number of the line we have just read. Note that lineno() is cumulative across all the files in the stream.
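Because lineno() keeps counting across files, the module also offers filelineno(), which restarts at 1 for each file. The sketch below records both for every line; the sample files and variable names are assumptions made for illustration.

```python
import fileinput
import os
import tempfile

# Hypothetical sample files for this sketch.
tmp_dir = tempfile.mkdtemp()
paths = []
for name, text in [('a.txt', 'first\nsecond\n'), ('b.txt', 'third\n')]:
    path = os.path.join(tmp_dir, name)
    with open(path, mode='w') as f:
        f.write(text)
    paths.append(path)

records = []
with fileinput.input(files=paths) as stream:
    for line in stream:
        records.append((
            os.path.basename(stream.filename()),  # file the line came from
            stream.lineno(),      # cumulative line number over all files
            stream.filelineno(),  # line number within the current file
            line.rstrip('\n'),
        ))

for record in records:
    print(record)
# ('a.txt', 1, 1, 'first')
# ('a.txt', 2, 2, 'second')
# ('b.txt', 3, 1, 'third')
```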
4. When the Cursor Reaches a New File
Apart from that, there are more functions in the fileinput module that we can make use of. For example, what if we want to do something when we reach a new file?
The function isfirstline() helps us to determine whether we're reading the first line of a new file.
with fi.input(files=glob('*.txt')) as f:
    for line in f:
        if f.isfirstline():
            print(f'> Start to read {f.filename()}...')
        print(line)
This can be very useful for logging purposes, because it keeps us informed of the current progress.
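Beyond logging, the same check can be used to reset per-file state. As a rough sketch under assumed file names and contents, the snippet below counts the lines of each file by starting a new counter whenever isfirstline() is true.

```python
import fileinput
import os
import tempfile

# Hypothetical sample files for this sketch.
tmp_dir = tempfile.mkdtemp()
paths = []
for name, text in [('a.txt', 'x\ny\n'), ('b.txt', 'p\nq\nr\n')]:
    path = os.path.join(tmp_dir, name)
    with open(path, mode='w') as f:
        f.write(text)
    paths.append(path)

line_counts = {}
with fileinput.input(files=paths) as stream:
    for line in stream:
        name = os.path.basename(stream.filename())
        if stream.isfirstline():
            # A new file has just started: begin a fresh counter for it.
            line_counts[name] = 0
        line_counts[name] += 1

print(line_counts)
```

One limitation of this approach is that an empty file never produces a first line, so it would not appear in the result at all.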
5. Jump to the Next File
We can also easily stop reading the current file and jump to the next one. The function nextfile() allows us to do so.
Before we can demo this feature, please let me rewrite the two sample files.
with open('my_file1.txt', mode='w') as f:
    f.write('This is line 1-1\n')
    f.write('stop reading\n')
    f.write('This is line 1-2\n')

with open('my_file2.txt', mode='w') as f:
    f.write('This is line 2-1\n')
    f.write('This is line 2-2\n')
The only difference from the original files is that I added a line with the text "stop reading" to the first text file. Let's say that we want the fileinput module to stop reading the first file and jump to the second one when it sees such content.
with fi.input(files=glob('*.txt')) as f:
    for line in f:
        if f.isfirstline():
            print(f'> Start to read {f.filename()}...')
        if line == 'stop reading\n':
            f.nextfile()
        else:
            print(line)
In the above code, another if-condition is added. When the line text is "stop reading", the stream jumps to the next file. Therefore, the line "1-2" is neither read nor output.
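Another possible use of nextfile() is to peek at only the first line of every file, for instance to collect headers. The sketch below is one way to do that; the file names, their contents, and the headers dictionary are assumptions for illustration.

```python
import fileinput
import os
import tempfile

# Hypothetical sample files, each starting with a header line.
tmp_dir = tempfile.mkdtemp()
paths = []
for name, text in [('a.txt', 'header A\nrow 1\nrow 2\n'),
                   ('b.txt', 'header B\nrow 1\n')]:
    path = os.path.join(tmp_dir, name)
    with open(path, mode='w') as f:
        f.write(text)
    paths.append(path)

headers = {}
with fileinput.input(files=paths) as stream:
    for line in stream:
        # Record the first line, then skip the rest of the current file.
        headers[os.path.basename(stream.filename())] = line.rstrip('\n')
        stream.nextfile()

print(headers)
```

Since every iteration ends with nextfile(), the loop body runs exactly once per non-empty file.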
6. Read Compressed Files Without Extracting
Sometimes we may have compressed files to read. Usually, we would have to uncompress them before we can read the content. However, with the fileinput module, we may not have to extract the content from the compressed files first.
Let's make up a compressed text file using gzip. This file will be used for demonstration purposes later.
import gzip
import shutil

with open('my_file1.txt', 'rb') as f_in:
    with gzip.open('my_file.gz', 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)
In the above code, we added the file my_file1.txt to a compressed file using gzip. Now, let's see how fileinput can read it without an extra step for uncompressing.
with fi.input(files='my_file.gz', openhook=fi.hook_compressed) as f:
    for line in f:
        print(line)
By passing fi.hook_compressed as the openhook parameter, the gzip file is uncompressed on the fly.
The hook chooses the decompressor from the file extension and currently supports gzip (.gz) and bzip2 (.bz2). Unfortunately, other formats are not supported; files with any other extension are opened normally.
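The same hook should work for bzip2 files. The sketch below assumes a hypothetical file sample.txt.bz2 created in a temporary directory. Depending on the Python version, lines read through hook_compressed may arrive as bytes rather than str, so the loop decodes them when needed; the UTF-8 encoding is also an assumption.

```python
import bz2
import fileinput
import os
import tempfile

# Hypothetical bzip2-compressed sample file for this sketch.
tmp_dir = tempfile.mkdtemp()
bz2_path = os.path.join(tmp_dir, 'sample.txt.bz2')
with bz2.open(bz2_path, mode='wt', encoding='utf-8') as f:
    f.write('compressed line 1\ncompressed line 2\n')

decoded = []
with fileinput.input(files=bz2_path,
                     openhook=fileinput.hook_compressed) as stream:
    for line in stream:
        # Some Python versions yield bytes for compressed files; normalize to str.
        if isinstance(line, bytes):
            line = line.decode('utf-8')
        decoded.append(line.rstrip('\n'))

print(decoded)
```

The hook recognizes the format from the .bz2 extension, so the file name matters here.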
Summary
In this article, I have introduced the Python built-in module fileinput and shown how to use it to read multiple text files. Of course, it will never replace the open() function, but in terms of reading multiple files into a single stream, I believe it is the best practice.
Source: https://towardsdatascience.com/the-best-practice-of-reading-text-files-in-python-509b1d4f5a4