Manipulating Files With Python
Manage Your Lovely Photos With Python! #PurePythonSeries — Episode #03
What is file manipulation?
File manipulation (creating a file, removing a directory, and so on) is a very common task in Python.
In this tutorial, let’s look at some useful file manipulation commands and learn how to use them, focusing on two modules: os and pathlib.
Challenge — Make it Yourself!
Distribute the 20 .jpg images into the following directories:
- electronics
- inspiration
- cities
- nature
- trips
- university
Sort them according to the theme column of the MS Excel worksheet (03_file_manipulation_techniques.xlsx).
Download all files below.
Python and File Manipulation On Your Machine
os and pathlib Modules Tutorial:
The os and pathlib modules are among the best modules for controlling folders and files on your computer. Other modules can help depending on what you’re looking to do, but in essence these two are enough to solve our challenge.
- our challenge:
You will have to distribute the photos into the directories indicated in the Excel sheet (03_file_manipulation_techniques.xlsx).
- Noteworthy
We will use pathlib here because it works fine regardless of the operating system you are using.
Usually, the paths on Windows, Mac, or Linux computers are different, but this is something that pathlib will solve for us nicely.
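To make that point concrete, here is a small sketch. It uses the "pure" path classes from pathlib, which only manipulate path text and never touch the disk, so it runs the same on any machine. The folder names are just the ones used in this tutorial.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The same logical path, built with the "/" operator on two path flavours.
posix_path = PurePosixPath("photos") / "organized" / "cities"
windows_path = PureWindowsPath("photos") / "organized" / "cities"

print(posix_path)    # photos/organized/cities
print(windows_path)  # photos\organized\cities

# Both flavours are made of the same parts; only the separator differs.
print(posix_path.parts == windows_path.parts)  # True
```

In everyday code you would simply use Path, which picks the right flavour for the operating system it runs on.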
- shutil Module
Copying and pasting files can be done with os and pathlib alone, but it is more difficult.
BUT, there is the shutil module to help us with this o/
Let’s get it on!
Overview:
- Module’s importing
from pathlib import Path
- Listing All Files in Current Folder
files = Path.cwd().iterdir()  # iterdir() needs a Path instance; Path.cwd() is the current folder
- Copying a File
import shutil
shutil.copy2('file_to_copy.extension', 'name_of_copied_created.extension')
- Moving a File
2 methods:
Path('path/file.extension').rename('new_path/file.extension')
or
shutil.move(Path('path/file.extension'), Path('new_path/file.extension'))
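The overview commands can be tried safely in a throwaway folder. The sketch below is only an illustration: the file and folder names (sample.txt, archive) are made up for the demo, and everything happens inside a temporary directory created by tempfile.

```python
import shutil
import tempfile
from pathlib import Path

# Work inside a throwaway directory so nothing on your machine is touched.
workdir = Path(tempfile.mkdtemp())
source = workdir / "sample.txt"          # hypothetical file name
source.write_text("hello")

# Listing: iterdir() is called on a Path instance.
names = sorted(p.name for p in workdir.iterdir())

# Copying: copy2 also tries to preserve metadata such as timestamps.
copied = workdir / "sample_copy.txt"
shutil.copy2(source, copied)

# Moving, method 1: Path.rename (the target folder must already exist).
archive = workdir / "archive"
archive.mkdir()
source.rename(archive / "sample.txt")

# Moving, method 2: shutil.move.
shutil.move(str(copied), str(archive / "sample_copy.txt"))

print(names)
print(sorted(p.name for p in archive.iterdir()))
```

Passing str(...) to shutil.move is a cautious choice here; recent Python versions also accept Path objects directly.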
Let’s Get Our Feet Wet:
- Step 1 — Let’s list all files in a folder / photos
from pathlib import Path

# print(Path.cwd())
path = Path('photos')
files = path.iterdir()
for file in files:
    print(file)

photos\agustin-diaz-gargiulo-GTLJklnjn-E-unsplash.jpg
photos\bruno-melo-XsAv0ItdT5w-unsplash.jpg
photos\claiton-conto-phaf9ASn3Do-unsplash.jpg
photos\erik-mclean-Cf-kY8HFJOs-unsplash.jpg
photos\erol-ahmed-aIYFR0vbADk-unsplash.jpg
photos\felix-girault-QyhfOCA_ldM-unsplash.jpg
photos\fpcamp-opUYWcQVQHg-unsplash.jpg
photos\greg-rakozy-oMpAz-DN-9I-unsplash.jpg
photos\harrison-broadbent-c3YpscwJb04-unsplash.jpg
photos\jorge-zapata-j2ExxxnN_w8-unsplash.jpg
photos\marco-tulio-de-miranda-NJrRhmQPLZc-unsplash.jpg
photos\marianna-smiley---JI4tAhTpE-unsplash.jpg
photos\mark-adriane-muS2RraYRuQ-unsplash.jpg
photos\maxim-hopman-Hin-rzhOdWs-unsplash.jpg
photos\possessed-photography-YKW0JjP7rlU-unsplash.jpg
photos\robo-wunderkind-oUgZVBaGcEQ-unsplash.jpg
photos\silvia-mc-donald-0mFarJHSy-M-unsplash.jpg
photos\tchelo-veiga-GOZKF3826Qc-unsplash.jpg
photos\thimo-van-leeuwen-EyAwxrQqAUE-unsplash.jpg
photos\victor-santos-pRcxVWRCs3k-unsplash.jpg
- Step 2 — Creating two Lists to convert them into a dictionary later
'''
Take the first 4 letters and store them in a dictionary along with the filename:
'jorg': 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg'
That way, later, we can map the first 4 letters to the .jpg image name:
photos\'agustin-diaz-gargiulo-GTLJklnjn-E-unsplash.jpg'
'''
import os
from pathlib import Path

name_key_list = []
file_value_list = []

# print(Path.cwd())
path = Path('photos')
files = path.iterdir()
for file in files:
    file_name = os.path.basename(file)
    file_value_list.append(file_name)
    artist_initial_letters = file_name[:4]
    name_key_list.append(artist_initial_letters)

# here are the two lists to convert to a dictionary
print(name_key_list)
print('\n')
print(file_value_list)
['agus', 'brun', 'clai', 'erik', 'erol', 'feli', 'fpca', 'greg', 'harr', 'jorg', 'marc', 'mari', 'mark', 'maxi', 'poss', 'robo', 'silv', 'tche', 'thim', 'vict']
['agustin-diaz-gargiulo-GTLJklnjn-E-unsplash.jpg', 'bruno-melo-XsAv0ItdT5w-unsplash.jpg', 'claiton-conto-phaf9ASn3Do-unsplash.jpg', 'erik-mclean-Cf-kY8HFJOs-unsplash.jpg', 'erol-ahmed-aIYFR0vbADk-unsplash.jpg', 'felix-girault-QyhfOCA_ldM-unsplash.jpg', 'fpcamp-opUYWcQVQHg-unsplash.jpg', 'greg-rakozy-oMpAz-DN-9I-unsplash.jpg', 'harrison-broadbent-c3YpscwJb04-unsplash.jpg', 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg', 'marco-tulio-de-miranda-NJrRhmQPLZc-unsplash.jpg', 'marianna-smiley---JI4tAhTpE-unsplash.jpg', 'mark-adriane-muS2RraYRuQ-unsplash.jpg', 'maxim-hopman-Hin-rzhOdWs-unsplash.jpg', 'possessed-photography-YKW0JjP7rlU-unsplash.jpg', 'robo-wunderkind-oUgZVBaGcEQ-unsplash.jpg', 'silvia-mc-donald-0mFarJHSy-M-unsplash.jpg', 'tchelo-veiga-GOZKF3826Qc-unsplash.jpg', 'thimo-van-leeuwen-EyAwxrQqAUE-unsplash.jpg', 'victor-santos-pRcxVWRCs3k-unsplash.jpg']
- Step 3 — Converting the Two List into a Dictionary
name_file_dict = dict(zip(name_key_list , file_value_list))
print(name_file_dict)

{'agus': 'agustin-diaz-gargiulo-GTLJklnjn-E-unsplash.jpg', 'brun': 'bruno-melo-XsAv0ItdT5w-unsplash.jpg', 'clai': 'claiton-conto-phaf9ASn3Do-unsplash.jpg', 'erik': 'erik-mclean-Cf-kY8HFJOs-unsplash.jpg', 'erol': 'erol-ahmed-aIYFR0vbADk-unsplash.jpg', 'feli': 'felix-girault-QyhfOCA_ldM-unsplash.jpg', 'fpca': 'fpcamp-opUYWcQVQHg-unsplash.jpg', 'greg': 'greg-rakozy-oMpAz-DN-9I-unsplash.jpg', 'harr': 'harrison-broadbent-c3YpscwJb04-unsplash.jpg', 'jorg': 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg', 'marc': 'marco-tulio-de-miranda-NJrRhmQPLZc-unsplash.jpg', 'mari': 'marianna-smiley---JI4tAhTpE-unsplash.jpg', 'mark': 'mark-adriane-muS2RraYRuQ-unsplash.jpg', 'maxi': 'maxim-hopman-Hin-rzhOdWs-unsplash.jpg', 'poss': 'possessed-photography-YKW0JjP7rlU-unsplash.jpg', 'robo': 'robo-wunderkind-oUgZVBaGcEQ-unsplash.jpg', 'silv': 'silvia-mc-donald-0mFarJHSy-M-unsplash.jpg', 'tche': 'tchelo-veiga-GOZKF3826Qc-unsplash.jpg', 'thim': 'thimo-van-leeuwen-EyAwxrQqAUE-unsplash.jpg', 'vict': 'victor-santos-pRcxVWRCs3k-unsplash.jpg'}
- Step 4 — Testing the Dictionary (name_file_dict)
name_file_dict['fpca']

'fpcamp-opUYWcQVQHg-unsplash.jpg'
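As a side note, the same kind of mapping can be built in a single pass with a dictionary comprehension. This is just an alternative sketch, not a replacement for the two-list approach; it uses a handful of file names copied from the listing above and adds a check that the 4-letter prefixes stay unique.

```python
# Sample names copied from the listing above.
file_names = [
    "fpcamp-opUYWcQVQHg-unsplash.jpg",
    "jorge-zapata-j2ExxxnN_w8-unsplash.jpg",
    "marco-tulio-de-miranda-NJrRhmQPLZc-unsplash.jpg",
    "marianna-smiley---JI4tAhTpE-unsplash.jpg",
    "mark-adriane-muS2RraYRuQ-unsplash.jpg",
]

# Same shape as dict(zip(keys, values)), built in one pass.
name_file_dict = {name[:4]: name for name in file_names}

# A 4-letter prefix only works as a key while it stays unique:
# a duplicate prefix would silently overwrite an earlier entry.
prefixes_are_unique = len(name_file_dict) == len(file_names)

print(prefixes_are_unique)     # True
print(name_file_dict["fpca"])  # fpcamp-opUYWcQVQHg-unsplash.jpg
```

Note how 'marc', 'mari' and 'mark' are close but still distinct; with a shorter prefix they would collide.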
- Step 5 — Now, let’s check whether the file we’re looking for exists in the folder.
if (path / 'bruno-melo-XsAv0ItdT5w-unsplash.jpg').exists():
    print(r'Yes, there is a file with that name \o/')

Yes, there is a file with that name \o/
- Step 6 — Creating a new folder (/organized)
Path('photos/organized').mkdir()
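One thing to keep in mind: a plain mkdir() raises FileExistsError if the folder is already there, which happens as soon as you run the cell twice. Here is a small sketch of a rerunnable variant, done in a temporary directory so it does not touch your real photos folder.

```python
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())            # throwaway folder for the demo
organized = base / "photos" / "organized"

# parents=True also creates "photos"; exist_ok=True makes reruns harmless.
organized.mkdir(parents=True, exist_ok=True)
organized.mkdir(parents=True, exist_ok=True)  # second call does nothing

print(organized.is_dir())  # True
```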
- Step 7 — Creating a copy of our file in the new folder we’ve just created (/organized)
import shutil

# forward slashes work on every operating system, Windows included
file_to_copy = Path('photos/bruno-melo-XsAv0ItdT5w-unsplash.jpg')
file_to_paste = Path('photos/organized/bruno-melo-XsAv0ItdT5w-unsplash.jpg')
shutil.copy2(file_to_copy, file_to_paste)

WindowsPath('photos/organized/bruno-melo-XsAv0ItdT5w-unsplash.jpg')
- Step 8 — Moving a file from one place to another (/organized → /organized/cities)
Path('photos/organized/cities').mkdir()
shutil.move(Path('photos/organized/bruno-melo-XsAv0ItdT5w-unsplash.jpg'),
            Path('photos/organized/cities/bruno-melo-XsAv0ItdT5w-unsplash.jpg'))

WindowsPath('photos/organized/cities/bruno-melo-XsAv0ItdT5w-unsplash.jpg')
— — — — Challenge — Make it Yourself! — — — —
Distribute the 20 images into the following directories:
- electronics
- inspiration
- cities
- nature
- trips
- university
Sort them according to the *theme* column of the *MS Excel* worksheet (03_file_manipulation_techniques.xlsx).
For instance, Zapata's photo must be saved in the \electronics folder. That way we'll have a well-organized photo album, won't we?
Note 1:
To get the name of a file as text in pathlib, use the .name attribute of a Path object (for example, file.name):
path = Path('Folder/File.csv')
print(path.name)  # -> 'File.csv'
Note 2:
Here is my reasoning: since you will have to loop over both the Excel rows and the file list, consider creating two dictionaries from lists (name_file_dict and name_theme_dict). In both, the key is the first four letters of the artist's name; the values are the file name and the theme (directory), respectively.
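If the idea in Note 2 feels abstract, this tiny sketch shows why a shared key helps. It uses only two entries taken from this tutorial's data, and the destinations dictionary is a hypothetical helper added for illustration.

```python
from pathlib import Path

# Two small dictionaries with the shape suggested in Note 2.
name_file_dict = {
    "jorg": "jorge-zapata-j2ExxxnN_w8-unsplash.jpg",
    "fpca": "fpcamp-opUYWcQVQHg-unsplash.jpg",
}
name_theme_dict = {"jorg": "electronics", "fpca": "trips"}

# The shared 4-letter key links each file name to its theme folder.
destinations = {
    key: Path("photos") / name_theme_dict[key] / file_name
    for key, file_name in name_file_dict.items()
}

print(destinations["jorg"].as_posix())
# photos/electronics/jorge-zapata-j2ExxxnN_w8-unsplash.jpg
```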
— — — — — — — — — — — — — — — — — — —
- Challenge_Step_1 — Let’s initialize by reading the excel file with Pandas:
import pandas as pd

themes_df = pd.read_excel('03_file_manipulation_techniques.xlsx')
themes_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20 entries, 0 to 19
Data columns (total 7 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   artist       20 non-null     object
 1   theme        20 non-null     object
 2   subject      20 non-null     object
 3   description  20 non-null     object
 4   Unnamed: 4   0 non-null      float64
 5   Unnamed: 5   0 non-null      float64
 6   Unnamed: 6   0 non-null      float64
dtypes: float64(3), object(4)
memory usage: 1.2+ KB
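The three "Unnamed" columns in the info() output contain no values at all. They do not get in the way of the challenge, but if you want a tidier DataFrame, one option is to drop every all-empty column. The sketch below uses a small stand-in DataFrame instead of the real worksheet, so it runs without the Excel file.

```python
import pandas as pd

# Stand-in for the worksheet: two real columns plus an empty one,
# similar to the "Unnamed" columns reported by info() above.
themes_df = pd.DataFrame(
    {
        "artist": ["Jorge Zapata", "fpcamp"],
        "theme": ["electronics", "trips"],
        "Unnamed: 4": [float("nan"), float("nan")],
    }
)

# Drop every column whose values are all missing.
clean_df = themes_df.dropna(axis="columns", how="all")

print(list(clean_df.columns))  # ['artist', 'theme']
```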
- Challenge_Step_2 — Creating two lists to convert them into a dictionary later;
For the key, we use the first four letters of the artist’s name; the value is the theme of that artist’s photo, which will also be the directory name.
key_list = []
value_list = []

for i, artist in enumerate(themes_df['artist']):
    artist = themes_df.loc[i, 'artist']
    key_list.append(artist[:4].lower())
    theme = themes_df.loc[i, 'theme']
    value_list.append(theme)
    subject = themes_df.loc[i, 'subject']
    description = themes_df.loc[i, 'description']
    # print(f'(artist)-> {artist} (Theme)-> {theme}')

# here are the two lists to convert to a dictionary
print(key_list)
print('\n')
print(value_list)

['jorg', 'erik', 'robo', 'thim', 'feli', 'harr', 'erol', 'greg', 'mark', 'fpca', 'vict', 'marc', 'tche', 'clai', 'brun', 'mari', 'agus', 'poss', 'maxi', 'silv']
['electronics', 'electronics', 'electronics', 'inspiration', 'electronics', 'electronics', 'nature', 'inspiration', 'inspiration', 'trips', 'trips', 'university', 'cities', 'cities', 'cities', 'cities', 'cities', 'electronics', 'electronics', 'cities']
- Challenge_Step_3 — Converting the Two List into a Dictionary
name_theme_dict = dict(zip(key_list , value_list))
print(name_theme_dict)

{'jorg': 'electronics', 'erik': 'electronics', 'robo': 'electronics', 'thim': 'inspiration', 'feli': 'electronics', 'harr': 'electronics', 'erol': 'nature', 'greg': 'inspiration', 'mark': 'inspiration', 'fpca': 'trips', 'vict': 'trips', 'marc': 'university', 'tche': 'cities', 'clai': 'cities', 'brun': 'cities', 'mari': 'cities', 'agus': 'cities', 'poss': 'electronics', 'maxi': 'electronics', 'silv': 'cities'}
- Challenge_Step_4 — Testing the Dictionary
name_theme_dict['fpca']

'trips'
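For the curious, pandas can also build this dictionary without an explicit loop, using its string accessor. This is only an alternative sketch on a three-row stand-in DataFrame (rows taken from this tutorial's data), not the method used above.

```python
import pandas as pd

# Stand-in for themes_df with three rows from the worksheet.
themes_df = pd.DataFrame(
    {
        "artist": ["Jorge Zapata", "MARK ADRIANE", "fpcamp"],
        "theme": ["electronics", "inspiration", "trips"],
    }
)

# .str[:4] slices each name and .str.lower() normalises the case.
keys = themes_df["artist"].str[:4].str.lower()
name_theme_dict = dict(zip(keys, themes_df["theme"]))

print(name_theme_dict)
# {'jorg': 'electronics', 'mark': 'inspiration', 'fpca': 'trips'}
```

The .lower() step matters because the worksheet mixes cases ('MARK ADRIANE', 'fpcamp') while the file names are all lowercase.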
- Challenge_Step_5 - Creating the 6 Directories (6 Themes)
from pathlib import Path
import shutil

themes = ['electronics', 'inspiration', 'cities', 'nature', 'trips', 'university']
for theme in themes:
    # print(theme)
    Path('photos/{}'.format(theme)).mkdir()
- Challenge_Final_Step — Loop Over /photos and Look Over the Dictionary for the 4 Artist’s Initial Letters
path = Path('photos/')
files = list(path.iterdir())  # snapshot the listing before moving anything

for file in files:
    file_name = file.name
    if file_name[-3:] == 'jpg':
        artist_initial_letters = file_name[:4]
        theme = name_theme_dict[artist_initial_letters]
        # print(theme)
        final_place = path / theme / file_name
        shutil.move(file, final_place)
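If you want to rehearse the final step before touching your real photos, here is a self-contained dry run. It is a sketch under a few assumptions: the files are empty stand-ins with made-up short names, the mapping has only two entries, and everything lives in a temporary folder. It also skips anything it cannot classify instead of raising a KeyError.

```python
import shutil
import tempfile
from pathlib import Path

# Build a throwaway "photos" folder with empty stand-in files.
photos = Path(tempfile.mkdtemp()) / "photos"
photos.mkdir()
for name in ("jorge-zapata.jpg", "fpcamp.jpg", "notes.txt"):  # hypothetical names
    (photos / name).touch()

name_theme_dict = {"jorg": "electronics", "fpca": "trips"}

# list(...) takes a snapshot first, so the folder is not modified
# while it is still being iterated.
for file in list(photos.iterdir()):
    if not file.is_file() or file.suffix.lower() != ".jpg":
        continue  # skip sub-folders and non-photo files
    theme = name_theme_dict.get(file.name[:4])
    if theme is None:
        continue  # no theme known for this prefix: leave the file alone
    target_dir = photos / theme
    target_dir.mkdir(exist_ok=True)
    shutil.move(str(file), str(target_dir / file.name))

result = sorted(p.relative_to(photos).as_posix() for p in photos.rglob("*.*"))
print(result)
```

Checking file.suffix rather than the last three characters also avoids matching a name that merely ends in "jpg" without being a .jpg file.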
Now let’s have some fun! — PANDAS Review!
Let’s Make Four Lists
from the PANDAS DATAFRAME made in Challenge_Step_1☝️
artist_list = []
theme_list = []
subject_list = []
description_list = []

for i, artist in enumerate(themes_df['artist']):
    artist = themes_df.loc[i, 'artist']
    artist_list.append(artist)
    theme = themes_df.loc[i, 'theme']
    theme_list.append(theme)
    subject = themes_df.loc[i, 'subject']
    subject_list.append(subject)
    description = themes_df.loc[i, 'description']
    description_list.append(description)
print(artist_list)
print('\n')
print(theme_list)
print('\n')
print(subject_list)
print('\n')
print(description_list)
['Jorge Zapata', 'Erik Mclean', 'Robo Wunderkind', 'Thimo van Leeuwen', 'Felix Girault', 'Harrison Broadbent', 'Erol Ahmed', 'Greg Rakozy', 'MARK ADRIANE', 'fpcamp', 'victor santos', 'Marco Túlio de Miranda', 'Tchelo Veiga', 'Claiton Conto', 'Bruno Melo', 'Marianna Smiley', 'Agustin Diaz Gargiulo', 'Possessed Photography', 'Maxim Hopman', 'Silvia Mc Donald']
['electronics', 'electronics', 'electronics', 'inspiration', 'electronics', 'electronics', 'nature', 'inspiration', 'inspiration', 'trips', 'trips', 'university', 'cities', 'cities', 'cities', 'cities', 'cities', 'electronics', 'electronics', 'cities']
['robotic', 'robotic', 'robotic', 'spine', 'arduino', 'raspiberry pi', 'cactus', 'universe', 'good vibe', 'caraca', 'rio das ostras', 'vicosa', 'ouro preto', 'ouro preto', 'manaus', 'salvador', 'rio de janeiro', 'artificial intelligence', 'mac', 'espaco alternativo']
['I am studing artificial intelligence - I received a certificate from huawei in artificial intelligence(https://medium.com/jungletronics/huawei-certification-heres-the-thing-493a6d60d478)', "wall-E, who doesn't remember star wars? as a young man I was amazed by the spillberg films.", 'Coding for kids with my site, Pleae, visit my website for children, learn about Lego, Raspi, etc: kidstronics - https://medium.com/kidstronics', 'everything has a pattern - A (...)', "without a doubt, the best platform...I'll still have one - Mac Pro is designed for pros who need the ultimate in CPU performance.", 'What is the cheapest gym to join? For me here is where i do i do jog to get in shape: Espaço Alternativo - Porto Velho City - Ro - Brazil']
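As a shortcut, each of these lists can also be obtained straight from a DataFrame column with tolist(). The sketch below uses a two-row stand-in for themes_df (values taken from the outputs above) so it runs on its own.

```python
import pandas as pd

# Two-row stand-in for themes_df.
themes_df = pd.DataFrame(
    {
        "artist": ["Jorge Zapata", "Erik Mclean"],
        "theme": ["electronics", "electronics"],
        "subject": ["robotic", "robotic"],
    }
)

# One call per column replaces the explicit loop.
artist_list = themes_df["artist"].tolist()
theme_list = themes_df["theme"].tolist()
subject_list = themes_df["subject"].tolist()

# The lists stay aligned, so one index works for all of them.
idx = artist_list.index("Erik Mclean")
print(idx, theme_list[idx], subject_list[idx])  # 1 electronics robotic
```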
Computer vision!
Hi Python Computer Vision — PIL! An Intro To Python Imaging Library #PyVisionSeries — Episode #00
Please, visit this post BY CLICKING HERE: 👉 link
import numpy as np
import matplotlib.pyplot as ptl
%matplotlib inline
from PIL import Image

idx = artist_list.index('Jorge Zapata')
# forward slashes avoid backslash-escape problems and work on every OS
image = Image.open('photos/electronics/jorge-zapata-j2ExxxnN_w8-unsplash.jpg')
pic_arr = np.asarray(image)
ptl.imshow(pic_arr)

print('Photo Art By: Jorge Zapata - Details:')
print('\tTheme: ',theme_list[idx])
print('\tSubject: ',subject_list[idx])
print('\tDescription: ',description_list[idx])
- Now let’s make a program to automate what we just did, okay?
r'''
Instructions: you can search by artist name
and see what these lovingly chosen photos mean to me.
It is a tribute to the wonderful work of these artists \o/

List of Artists (copy/paste when running the program below):
['Jorge Zapata', 'Erik Mclean', 'Robo Wunderkind', 'Thimo van Leeuwen','Felix Girault', 'Harrison Broadbent', 'Erol Ahmed', 'Greg Rakozy','MARK ADRIANE', 'fpcamp', 'victor santos', 'Marco Túlio de Miranda','Tchelo Veiga', 'Claiton Conto', 'Bruno Melo', 'Marianna Smiley','Agustin Diaz Gargiulo', 'Possessed Photography', 'Maxim Hopman', 'Silvia Mc Donald']

default: Marco Túlio de Miranda
'''
path = Path('photos/')
# an empty answer falls back to the default artist named above
artist_name = input('Insert the name of the Artist: ') or 'Marco Túlio de Miranda'

if artist_name in artist_list:
    # look up the index only after the membership check,
    # otherwise an unknown name raises ValueError before reaching the else branch
    idx = artist_list.index(artist_name)
    artist_initial_letters = artist_name[:4].lower()
    file_name = name_file_dict[artist_initial_letters]
    theme = name_theme_dict[artist_initial_letters]
    file_to_open = path / theme / file_name
    image = Image.open(file_to_open)
    pic_arr = np.asarray(image)
    ptl.imshow(pic_arr)
    print(artist_name, ' Photo Art: ')
    print('\tTheme: ', theme_list[idx])
    print('\tSubject: ', subject_list[idx])
    print('\tDescription: ', description_list[idx])
else:
    print(f'{artist_name} not on the artist list :/ Try Again! :)')
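The path lookup at the heart of this program can also be pulled out into a small function, which makes it easy to test without opening any image. The helper below, find_photo, is a hypothetical name introduced here for illustration; it assumes the two dictionaries have the same shape as name_file_dict and name_theme_dict.

```python
from pathlib import Path
from typing import Optional


def find_photo(artist_name: str, name_file_dict: dict, name_theme_dict: dict,
               root: Path = Path("photos")) -> Optional[Path]:
    """Return the expected photo path for an artist, or None if unknown."""
    # Ignore surrounding spaces and letter case before taking the prefix.
    key = artist_name.strip()[:4].lower()
    if key not in name_file_dict or key not in name_theme_dict:
        return None
    return root / name_theme_dict[key] / name_file_dict[key]


# One entry from this tutorial's data.
files = {"mark": "mark-adriane-muS2RraYRuQ-unsplash.jpg"}
themes = {"mark": "inspiration"}

print(find_photo("MARK ADRIANE", files, themes))
print(find_photo("  mark adriane ", files, themes))
print(find_photo("Unknown Artist", files, themes))  # None
```

Keep in mind that this matches on the first four letters only, so any name starting with "mark" would find the same photo; the original program is stricter because it first checks the full name against artist_list.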
I hope you enjoyed that lecture o/
If you find this post helpful, please click the applause button and subscribe to the page for more articles like this one.
Until next time!
👉Jupyter notebook link :)
👉excel file link
👉or Colab link
👉git
Credits & References
Hashtag Treinamentos by João Paulo Rodrigues de Lira — Thank you dude!
Photos from https://unsplash.com/
Related Posts
00#Episode#PurePythonSeries — Lambda in Python — Python Lambda Desmistification
01#Episode#PurePythonSeries — Send Email in Python — Using Jupyter Notebook — How To Send Gmail In Python
02#Episode#PurePythonSeries — Automate Your Email With Python & Outlook — How To Create An Email Trigger System in Python
03#Episode#PurePythonSeries — Manipulating Files With Python —Manage Your Lovely Photos With Python! (this one)
04#Episode#PurePythonSeries — Pandas DataFrame Advanced — A Complete Notebook Review
05#Episode#PurePythonSeries — Is This Leap Year? Python Calendar — How To Calculate If The Year Is Leap Year and How Many Days Are In The Month