Manipulating Files With Python
Manage Your Lovely Photos With Python! #PurePythonSeries — Episode #03
What is file manipulation?
File manipulation (creating a file, removing a directory, and so on) is a very common task in Python.
In this tutorial, we'll look at some useful file manipulation commands and learn how to use them, focusing on two modules: os and pathlib.
Challenge — Make it Yourself!
Distribute the 20 .jpg images into the following directories:
- electronics
- inspiration
- cities
- nature
- trips
- university
Place each image according to the theme column of the MS Excel worksheet (03_file_manipulation_techniques.xlsx).
👉️Colab nb
👉️Download xls and photos.zip
Instructions:
Python and File Manipulation On Your Machine
Mastering the os and pathlib Modules:
The os and pathlib modules are powerful tools for managing folders and files on your computer. Other modules may offer additional capabilities depending on your needs, but these two cover a wide range of file manipulation tasks. This tutorial will guide you through using them to solve common challenges.
Our Challenge:
- You will need to distribute photos across the directories specified in the Excel sheet (03_file_manipulation_techniques.xlsx).
Key Points:
- We will be using pathlib because it handles file paths consistently across operating systems. Whether you're on Windows, Mac, or Linux, pathlib smooths over the differences in path syntax for us.
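To make that concrete, here is a minimal sketch using the "pure" path classes from pathlib, which let you see how the same path parts render on each system without touching the disk. The file name is one of the photos used later in this tutorial; everything else is standard pathlib.

```python
from pathlib import PurePosixPath, PureWindowsPath

parts = ('sample_data', 'photos', 'bruno-melo-XsAv0ItdT5w-unsplash.jpg')

# The same parts, rendered with each system's separator
posix_path = PurePosixPath(*parts)
windows_path = PureWindowsPath(*parts)

print(posix_path)    # sample_data/photos/bruno-melo-XsAv0ItdT5w-unsplash.jpg
print(windows_path)  # sample_data\photos\bruno-melo-XsAv0ItdT5w-unsplash.jpg

# The / operator builds the same path piece by piece
joined = PurePosixPath('sample_data') / 'photos' / parts[2]
print(joined == posix_path)  # True

# Handy attributes for pulling a path apart
print(posix_path.name, posix_path.stem, posix_path.suffix)
```

In everyday code you would simply use Path, which picks the right flavor for the machine it runs on.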
shutil Module:
- While you could copy and move files with os and pathlib alone, doing so is more complex. Fortunately, the shutil module makes those tasks easier.
Let's get started!
Overview:
- Importing the module
from pathlib import Path
- Listing all files in the current folder
files = Path.cwd().iterdir()
- Copying a File
import shutil
shutil.copy2('file_to_copy.extension', 'name_of_copied_created.extension')
- Moving a File; 2 methods:
Path('path/file.extension').rename('new_path/file.extension')
or
shutil.move(Path('path/file.extension'), Path('new_path/file.extension'))
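If you'd like to try these commands without touching your real files, the sketch below runs all of them inside a temporary folder. The file and folder names (photo.jpg, album) are made up for the demo.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    original = root / 'photo.jpg'
    original.write_bytes(b'fake image data')

    # Listing: iterdir() is called on a Path instance
    listed = sorted(p.name for p in root.iterdir())

    # Copying
    copied = root / 'photo_copy.jpg'
    shutil.copy2(original, copied)

    # Moving, method 1: Path.rename
    album = root / 'album'
    album.mkdir()
    original.rename(album / 'photo.jpg')

    # Moving, method 2: shutil.move
    shutil.move(str(copied), str(album / 'photo_copy.jpg'))

    final_listing = sorted(p.name for p in album.iterdir())

print(listed)         # ['photo.jpg']
print(final_listing)  # ['photo.jpg', 'photo_copy.jpg']
```

One practical difference between the two moving methods: rename can fail when source and destination sit on different file systems, while shutil.move falls back to copying and then deleting the source.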
1 Step — Listing All Files in the Photos Folder:
Let’s dive in and list all the files in the /photos directory.
from pathlib import Path

# print(Path.cwd())  # uncomment to check the current working directory
path = Path('sample_data/photos')
files = path.iterdir()
for file in files:
    print(file)
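iterdir() returns everything in the folder, including sub-folders and non-image files. As a small variation, here is a sketch that lists only the .jpg files with glob(), or only regular files with is_file(). It runs in a temporary folder with made-up file names so it is safe to try.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    photos = Path(tmp)
    for name in ('b.jpg', 'a.jpg', 'notes.txt'):
        (photos / name).touch()
    (photos / 'organized').mkdir()

    # Only files matching the pattern, sorted for a stable order
    jpg_names = sorted(p.name for p in photos.glob('*.jpg'))

    # Everything that is a regular file (sub-folders are skipped)
    files_only = sorted(p.name for p in photos.iterdir() if p.is_file())

print(jpg_names)   # ['a.jpg', 'b.jpg']
print(files_only)  # ['a.jpg', 'b.jpg', 'notes.txt']
```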
2 Step — Creating Two Lists to Convert into a Dictionary Later:
'''
Take the first 4 letters of each file name and store them alongside the file name:
'jorg': 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg'
That way, later, we can map the 4-letter key to the .jpg file name,
e.g. 'agus': 'agustin-diaz-gargiulo-GTLJklnjn-E-unsplash.jpg'
'''
import os
from pathlib import Path

name_key_list = []
file_value_list = []

# print(Path.cwd())
path = Path('sample_data/photos')
files = path.iterdir()
for file in files:
    file_name = os.path.basename(file)  # same result as file.name
    file_value_list.append(file_name)
    artist_initial_letters = file_name[:4]
    name_key_list.append(artist_initial_letters)
# here are the two lists to convert to a dictionary
print(name_key_list)
print('\n')
print(file_value_list)
3 Step — Converting the Two Lists into a Dictionary:
name_file_dict = dict(zip(name_key_list , file_value_list))
print(name_file_dict)
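The same dictionary can be built in one pass with a dict comprehension. The sketch below also checks for key collisions, since two file names starting with the same four letters would silently overwrite each other in the dictionary. The three file names are taken from this tutorial; the collision check is an extra precaution, not part of the original steps.

```python
from collections import Counter

file_names = [
    'jorge-zapata-j2ExxxnN_w8-unsplash.jpg',
    'bruno-melo-XsAv0ItdT5w-unsplash.jpg',
    'agustin-diaz-gargiulo-GTLJklnjn-E-unsplash.jpg',
]

# Key: first four letters, value: full file name
name_file_dict = {name[:4]: name for name in file_names}

# Any key that appears more than once would lose a file in the dictionary
key_counts = Counter(name[:4] for name in file_names)
collisions = [key for key, count in key_counts.items() if count > 1]

print(name_file_dict['jorg'])  # jorge-zapata-j2ExxxnN_w8-unsplash.jpg
print(collisions)              # []
```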
4 Step — Testing the Dictionary (name_file_dict):
name_file_dict['fpca']
5 Step — Now, let’s check whether the file we’re looking for exists in the folder:
if (path / 'bruno-melo-XsAv0ItdT5w-unsplash.jpg').exists():
    print('Yes, there is a file with that name \\o/')
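Note that exists() is also true for folders. If you need to be sure you found a file, is_file() is the stricter check. A quick sketch in a temporary folder, with made-up names:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / 'organized').mkdir()
    (folder / 'photo.jpg').touch()

    # For each name: (exists?, is a regular file?)
    checks = {
        name: ((folder / name).exists(), (folder / name).is_file())
        for name in ('photo.jpg', 'organized', 'missing.jpg')
    }

for name, (exists, is_file) in checks.items():
    print(f'{name}: exists={exists}, is_file={is_file}')
```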
6 Step — Creating a new folder (/organized):
Path('sample_data/photos/organized').mkdir(exist_ok=True)  # no error if the folder already exists
7 Step — Copying our file into the newly created /organized folder:
import shutil
file_to_copy = Path(r'sample_data/photos/bruno-melo-XsAv0ItdT5w-unsplash.jpg')
file_to_paste = Path(r'sample_data/photos/organized/bruno-melo-XsAv0ItdT5w-unsplash.jpg')
shutil.copy2(file_to_copy, file_to_paste)
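Why copy2 rather than copy? copy2 also tries to preserve file metadata such as the modification time, which is nice for photos. The sketch below shows the difference in a temporary folder; the file names and the timestamp are arbitrary values chosen for the demo.

```python
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    source = folder / 'photo.jpg'
    source.write_bytes(b'fake image data')

    # Pretend the photo was last modified back in 2001
    old_time = 1_000_000_000
    os.utime(source, (old_time, old_time))

    with_metadata = folder / 'copy2.jpg'
    without_metadata = folder / 'copy.jpg'
    shutil.copy2(source, with_metadata)    # content + metadata
    shutil.copy(source, without_metadata)  # content + permissions only

    copy2_mtime = int(with_metadata.stat().st_mtime)
    copy_mtime = int(without_metadata.stat().st_mtime)

print(copy2_mtime == old_time)  # True: the old timestamp was kept
print(copy_mtime == old_time)   # False: the copy got a fresh timestamp
```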
8 Step — Moving a file from /organized to /organized/cities:
Path('sample_data/photos/organized/cities').mkdir()
shutil.move(Path(r'sample_data/photos/organized/bruno-melo-XsAv0ItdT5w-unsplash.jpg'), Path(r'sample_data/photos/organized/cities/bruno-melo-XsAv0ItdT5w-unsplash.jpg'))
Here’s an improved version that builds the destination path from the file name, creates the folder only if it is missing, and reports errors instead of crashing:
from pathlib import Path
import shutil

# Define the source and destination paths
source_file = Path('sample_data/photos/organized/bruno-melo-XsAv0ItdT5w-unsplash.jpg')
destination_dir = Path('sample_data/photos/organized/cities')
destination_file = destination_dir / source_file.name

# Create the destination directory if it doesn't exist
destination_dir.mkdir(parents=True, exist_ok=True)

try:
    # Move the file to the destination directory
    shutil.move(str(source_file), str(destination_file))
    print(f"File moved to {destination_file}")
except Exception as e:
    print(f"An error occurred: {e}")
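Since we will be moving many files soon, the same logic can be wrapped in a small function. This is only a sketch: move_into is a hypothetical helper name, and refusing to overwrite an existing file is a design choice of this example, not something the steps above require.

```python
import shutil
import tempfile
from pathlib import Path


def move_into(source_file: Path, destination_dir: Path) -> Path:
    """Move source_file into destination_dir and return its new path."""
    destination_dir.mkdir(parents=True, exist_ok=True)
    destination_file = destination_dir / source_file.name
    if destination_file.exists():
        # Stop rather than silently replace an existing photo
        raise FileExistsError(f'{destination_file} already exists')
    shutil.move(str(source_file), str(destination_file))
    return destination_file


# Try it in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    organized = Path(tmp) / 'organized'
    organized.mkdir()
    photo = organized / 'bruno-melo-XsAv0ItdT5w-unsplash.jpg'
    photo.touch()

    new_location = move_into(photo, organized / 'cities')
    moved_ok = new_location.exists() and not photo.exists()
    relative = new_location.relative_to(tmp).as_posix()

print(moved_ok)  # True
print(relative)  # organized/cities/bruno-melo-XsAv0ItdT5w-unsplash.jpg
```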
We’re doing great so far. Now, let’s tackle a real challenge and put our knowledge to the test.
— — — — Challenge — Make it Yourself! — — — —
Distribute the 20 images into the following directories based on the theme column in the MS Excel worksheet headers (03_file_manipulation_techniques.xlsx):
- electronics
- inspiration
- cities
- nature
- trips
- university
For example, Zapata's photo should be saved in the \electronics folder.
This ensures a well-organized photo album, right?
Note 1:
To extract a file name as text using pathlib, use Path.name or file.name:
path = Path('Folder/File.csv')
print(path.name)  # Output: File.csv
Note 2:
Here's my suggestion: since you'll need to loop over both the Excel rows and the list of files, consider creating two dictionaries, name_file_dict and name_theme_dict. Use the first four letters of the artist's name as the key in both, with the file name as the value in the first and the theme (the directory name) as the value in the second.
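To picture how the two dictionaries work together, here is a tiny sketch with two entries. The 'jorg' entries come from this tutorial (Zapata's photo belongs in electronics); the 'cities' theme for 'brun' is only an assumption for illustration, so check the Excel sheet for the real value.

```python
from pathlib import PurePosixPath

name_file_dict = {
    'jorg': 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg',
    'brun': 'bruno-melo-XsAv0ItdT5w-unsplash.jpg',
}
name_theme_dict = {
    'jorg': 'electronics',
    'brun': 'cities',  # assumed theme, for illustration only
}

photos = PurePosixPath('sample_data/photos')

# The shared 4-letter key links each file name to its theme folder
destinations = {
    key: photos / name_theme_dict[key] / file_name
    for key, file_name in name_file_dict.items()
    if key in name_theme_dict
}

for key, destination in destinations.items():
    print(key, '->', destination)
```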
— — — — — — — — — — — — — — — — — — —
1 Challenge_Step — Let’s start by reading the Excel file with pandas:
import pandas as pd
themes_df = pd.read_excel('sample_data/03_file_manipulation_techniques.xlsx')
themes_df.info()
2 Challenge_Step — Creating two lists to later convert into a dictionary:
- Keys: The first four letters of each artist’s name.
- Values: The theme of the artist’s photo, which will be used as directory names.
key_list = []
value_list = []
# Walk the artist and theme columns together, row by row
for artist, theme in zip(themes_df['artist'], themes_df['theme']):
    key_list.append(artist[:4].lower())  # e.g. 'Jorge Zapata' -> 'jorg'
    value_list.append(theme)
    # print(f'(artist)-> {artist} (Theme)-> {theme}')
# here are the two lists to convert to a dictionary
print(key_list)
print('\n')
print(value_list)
3 Challenge_Step — Converting Two Lists into a Dictionary:
name_theme_dict = dict(zip(key_list , value_list))
print(name_theme_dict)
4 Challenge_Step — Testing the Dictionary:
name_theme_dict['fpca']
5 Challenge_Step — Creating 6 Directories (6 Themes):
from pathlib import Path
import shutil

themes = ['electronics', 'inspiration', 'cities', 'nature', 'trips', 'university']
for theme in themes:
    # print(theme)
    # exist_ok=True lets you re-run this step without a FileExistsError
    (Path('sample_data/photos') / theme).mkdir(exist_ok=True)
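Instead of typing the theme names by hand, you could also derive them from the values of name_theme_dict, so the folders always match the spreadsheet. A sketch in a temporary folder follows; the dictionary here is a small stand-in whose 'cities' entries are assumed for illustration.

```python
import tempfile
from pathlib import Path

# Stand-in for the dictionary built from the Excel sheet
name_theme_dict = {'jorg': 'electronics', 'brun': 'cities', 'agus': 'cities'}

with tempfile.TemporaryDirectory() as tmp:
    photos = Path(tmp) / 'photos'

    # set() removes repeated themes; run twice to show that exist_ok avoids errors
    for _ in range(2):
        for theme in sorted(set(name_theme_dict.values())):
            (photos / theme).mkdir(parents=True, exist_ok=True)

    created = sorted(p.name for p in photos.iterdir() if p.is_dir())

print(created)  # ['cities', 'electronics']
```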
6 Challenge_Final_Step — Loop through /photos and look up the first four letters of each file name in the dictionary:
path = Path('sample_data/photos/')
# Take a snapshot of the folder first, because we move files while looping
files = list(path.iterdir())
for file in files:
    file_name = file.name
    # The suffix check also skips the theme folders themselves
    if file.suffix.lower() == '.jpg':
        artist_initial_letters = file_name[:4]
        # print(artist_initial_letters)
        theme = name_theme_dict[artist_initial_letters]
        # print(theme)
        final_place = path / theme / file_name
        shutil.move(str(file), str(final_place))
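One thing the loop above does not handle is a photo whose first four letters are missing from the dictionary: that raises a KeyError. Here is a sketch of a more forgiving version that collects such files instead of stopping. The function name distribute is hypothetical, and the test files (including unknown-artist.jpg) are invented for the demo.

```python
import shutil
import tempfile
from pathlib import Path


def distribute(photos_dir: Path, name_theme_dict: dict) -> tuple:
    """Move each .jpg into its theme folder; return (moved, skipped) names."""
    moved, skipped = [], []
    # list() takes a snapshot, since the folder changes while we loop
    for file in list(photos_dir.iterdir()):
        if not file.is_file() or file.suffix.lower() != '.jpg':
            continue
        theme = name_theme_dict.get(file.name[:4].lower())
        if theme is None:
            skipped.append(file.name)  # no matching artist key
            continue
        target_dir = photos_dir / theme
        target_dir.mkdir(exist_ok=True)
        shutil.move(str(file), str(target_dir / file.name))
        moved.append(f'{theme}/{file.name}')
    return sorted(moved), sorted(skipped)


with tempfile.TemporaryDirectory() as tmp:
    photos = Path(tmp)
    for name in ('jorge-zapata-j2ExxxnN_w8-unsplash.jpg',
                 'unknown-artist.jpg', 'notes.txt'):
        (photos / name).touch()

    moved, skipped = distribute(photos, {'jorg': 'electronics'})

print(moved)    # ['electronics/jorge-zapata-j2ExxxnN_w8-unsplash.jpg']
print(skipped)  # ['unknown-artist.jpg']
```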
Now let’s have some fun! PANDAS Review!
Let’s make four lists from the pandas DataFrame created in Challenge_Step 1 ☝️
# Each DataFrame column becomes a plain Python list, in row order
artist_list = themes_df['artist'].tolist()
theme_list = themes_df['theme'].tolist()
subject_list = themes_df['subject'].tolist()
description_list = themes_df['description'].tolist()

print(artist_list)
print('\n')
print(theme_list)
print('\n')
print(subject_list)
print('\n')
print(description_list)
Computer vision!
For an introduction to the Python Imaging Library, see the related post: Hi Python Computer Vision — PIL! An Intro To Python Imaging Library (#PyVisionSeries, Episode #00).
import numpy as np
import matplotlib.pyplot as ptl
%matplotlib inline
from PIL import Image
i = artist_list.index('Jorge Zapata')
image = Image.open('sample_data/photos/electronics/jorge-zapata-j2ExxxnN_w8-unsplash.jpg')
pic_arr = np.asarray(image)
ptl.imshow(pic_arr)
print('Photo Art By: Jorge Zapata - Details:')
print('\tTheme: ',theme_list[i])
print('\tSubject: ',subject_list[i])
print('\tDescription: ',description_list[i])
- Now let’s make a program to automate what we’ve just done, okay?
r'''
Instructions:
You can search by artist name
and see what these lovingly chosen photos mean to me.
It is a tribute to the wonderful work of these artists \o/
List of Artists (copy/paste when running the program below):
['Jorge Zapata', 'Erik Mclean', 'Robo Wunderkind', 'Thimo van Leeuwen',
'Felix Girault', 'Harrison Broadbent', 'Erol Ahmed', 'Greg Rakozy',
'MARK ADRIANE', 'fpcamp', 'victor santos', 'Marco Túlio de Miranda',
'Tchelo Veiga', 'Claiton Conto', 'Bruno Melo', 'Marianna Smiley',
'Agustin Diaz Gargiulo', 'Possessed Photography', 'Maxim Hopman',
'Silvia Mc Donald']
default: Marco Túlio de Miranda
'''
path = Path('sample_data/photos/')
# An empty answer falls back to the default artist
artist_name = input('Insert the name of the Artist: ').strip() or 'Marco Túlio de Miranda'
if artist_name in artist_list:
    # Look up the index only after confirming the name exists (avoids a ValueError)
    idx = artist_list.index(artist_name)
    artist_initial_letters = artist_name[:4].lower()
    file_name = name_file_dict[artist_initial_letters]
    theme = name_theme_dict[artist_initial_letters]
    file_to_open = path / theme / file_name
    image = Image.open(file_to_open)
    pic_arr = np.asarray(image)
    ptl.imshow(pic_arr)
    print(artist_name, ' Photo Art: ')
    print('\tTheme: ', theme_list[idx])
    print('\tSubject: ', subject_list[idx])
    print('\tDescription: ', description_list[idx])
else:
    print(f'{artist_name} not on the artist list :/ Try Again! :)')
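The program above only finds a name typed exactly as it appears in the list. As a possible refinement, here is a sketch of a lookup that ignores upper/lower case and stray spaces. find_photo is a hypothetical helper, the sample dictionaries hold a single entry taken from this tutorial, and the function only builds the path (it does not open the image).

```python
from pathlib import PurePosixPath


def find_photo(artist_name, artist_list, name_file_dict, name_theme_dict,
               root='sample_data/photos'):
    """Return the photo path for an artist, or None if there is no match."""
    # Map a case-insensitive version of each name to its original spelling
    lookup = {name.casefold(): name for name in artist_list}
    canonical = lookup.get(artist_name.strip().casefold())
    if canonical is None:
        return None
    key = canonical[:4].lower()
    if key not in name_file_dict or key not in name_theme_dict:
        return None
    return PurePosixPath(root) / name_theme_dict[key] / name_file_dict[key]


artists = ['Jorge Zapata']
files = {'jorg': 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg'}
themes = {'jorg': 'electronics'}

found = find_photo('  jorge zapata ', artists, files, themes)
missing = find_photo('Nobody', artists, files, themes)

print(found)    # sample_data/photos/electronics/jorge-zapata-j2ExxxnN_w8-unsplash.jpg
print(missing)  # None
```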
I hope you enjoyed the lecture! 🎉
If you found this post helpful, please click the applause button and subscribe for more content like this.
See you next time!
👉Jupyter notebook link :)
👉Excel file link
👉or Colab link
👉git
Credits & References
Hashtag Treinamentos by João Paulo Rodrigues de Lira — Thank you dude!
Photos from https://unsplash.com/
Related Posts
00#Episode#PurePythonSeries — Lambda in Python — Python Lambda Desmistification
01#Episode#PurePythonSeries — Send Email in Python — Using Jupyter Notebook — How To Send Gmail In Python
02#Episode#PurePythonSeries — Automate Your Email With Python & Outlook — How To Create An Email Trigger System in Python
03#Episode#PurePythonSeries — Manipulating Files With Python — Manage Your Lovely Photos With Python! (this one)
04#Episode#PurePythonSeries — Pandas DataFrame Advanced — A Complete Notebook Review
05#Episode#PurePythonSeries — Is This Leap Year? Python Calendar — How To Calculate If The Year Is Leap Year and How Many Days Are In The Month
06#Episode#PurePythonSeries — List Comprehension In Python — Locked-in Secrets About List Comprehension
07#Episode#PurePythonSeries — Graphs — In Python — Extremely Simple Algorithms in Python
08#Episode#PurePythonSeries — Decorator in Python — How To Simplifying Your Code And Boost Your Function
10#Episode#PurePythonSeries — CS50 — A Taste of Python — Harvard Mario’s Challenge Solver \o/
11#Episode#PurePythonSeries — Python — Send Email Using SMTP — Send Mail To Any Internet Machine (SMTP or ESMTP)
12#Episode#PurePythonSeries — Advanced Python Technologies — qrcode, Speech Recognition in Python, Google Speech Recognition
13#Episode#PurePythonSeries — Advanced Python Technologies II — qFace Recognition w/ Jupyter Notebook & Ubuntu
14#Episode#PurePythonSeries — Advanced Python Technologies III — Face Recognition w/ Colab
15#Episode#PurePythonSeries — ISS Tracking Project — Get an Email alert when International Space Station (ISS) is above of us in the sky, at night
16#Episode#PurePythonSeries — Using Gemini Chat on Collab — Random Number Generation, List Manipulation & Rock-Paper-Scissors Game Implementations
17#Episode#PurePythonSeries — Python — Basics — Functions, OOP, file handling, calculator, loops
18#Episode#PurePythonSeries — Python — Efficient File Handling in Python — Best Practices and Common Methods
Credits & References
How To Construct For Loops in Python 3 https://www.digitalocean.com/community/tutorials/how-to-construct-for-loops-in-python-3
Reviewed in August 2024