Manipulating Files With Python

Manage Your Lovely Photos With Python! #PurePythonSeries — Episode #03

J3
Jungletronics
Sep 10, 2021

What is file manipulation?

File manipulation (creating a file, removing a directory, and so on) is a very common task in Python.

In this tutorial, let’s look at some useful file manipulation commands and learn how to use them, specifically from the os and pathlib modules.

Challenge — Make it Yourself!

Distribute the 20 .jpg images into the following directories:
- electronics
- inspiration
- cities
- nature
- trips
- university
According to the theme column of the MS Excel worksheet (03_file_manipulation_techniques.xlsx).
Download all files below.
See the program working in the gif below:
Gif 1 . Program working, creating a directory, and manipulating the files.
Fig 1. Here is the beginning of everything!

Python and File Manipulation On Your Machine

os and pathlib Modules Tutorial:

The os and pathlib modules are among the best modules/libraries for controlling folders and files on your computer. A few other modules can help, depending on what you’re looking to do, but in essence we’ll be able to use these modules to solve our challenge.

  • our challenge:

You will have to distribute the photos into the directories indicated in the Excel sheet (03_file_manipulation_techniques.xlsx).

  • Noteworthy

We will use pathlib here because it works fine regardless of the operating system you are using.

Usually, the paths on Windows, Mac, or Linux computers are different, but this is something that pathlib will solve for us nicely.
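To make the point above concrete, here is a minimal sketch of how pathlib builds the same logical path for two operating systems. It uses the "pure" path classes, which only manipulate text and never touch the disk, so it should run the same way on any machine. The folder names are simply the ones used later in this tutorial.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The same logical path, built with the "/" operator on two path flavours.
posix_path = PurePosixPath('photos') / 'organized' / 'cities'
windows_path = PureWindowsPath('photos') / 'organized' / 'cities'

print(posix_path)    # photos/organized/cities
print(windows_path)  # photos\organized\cities

# as_posix() always returns forward slashes, whatever the flavour.
print(windows_path.as_posix())  # photos/organized/cities
```

In everyday code you would just write Path(...) and let Python pick the flavour that matches your system.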

  • shutil Module

Copying and pasting files can be done with the os and pathlib modules alone, but it is more difficult.

BUT, there is the shutil module to help us with this o/
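As a rough illustration of the difference, the sketch below copies a small file twice: once with pathlib alone and once with shutil. It works inside a temporary folder, and the file names are made up for the example.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / 'original.txt'   # hypothetical file name
    source.write_text('hello')

    # pathlib only: read every byte, then write it to the new file.
    manual_copy = Path(tmp) / 'manual_copy.txt'
    manual_copy.write_bytes(source.read_bytes())

    # shutil: one call, and copy2 also tries to preserve metadata
    # such as the modification time.
    shutil_copy = Path(tmp) / 'shutil_copy.txt'
    shutil.copy2(source, shutil_copy)

    manual_text = manual_copy.read_text()
    shutil_text = shutil_copy.read_text()

print(manual_text, shutil_text)  # hello hello
```

Both approaches produce the same content; shutil just saves typing and handles large files without loading them fully into memory.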

Let’s get it on!

Overview:

  • Importing the Module
from pathlib import Path
  • Listing All Files in Current Folder
files = Path.cwd().iterdir()
  • Copying a File
import shutil
shutil.copy2('file_to_copy.extension', 'name_of_copied_created.extension')
  • Moving a File
2 methods:
Path('path/file.extension').rename('new_path/file.extension')
or
shutil.move(Path('path/file.extension'), Path('new_path/file.extension'))

Let’s Get Our Feet Wet:

  • Step 1 — Let’s list all files in a folder / photos
from pathlib import Path
#print(Path.cwd())
path = Path('photos')
files = path.iterdir()
for file in files:
    print(file)
photos\agustin-diaz-gargiulo-GTLJklnjn-E-unsplash.jpg
photos\bruno-melo-XsAv0ItdT5w-unsplash.jpg
photos\claiton-conto-phaf9ASn3Do-unsplash.jpg
photos\erik-mclean-Cf-kY8HFJOs-unsplash.jpg
photos\erol-ahmed-aIYFR0vbADk-unsplash.jpg
photos\felix-girault-QyhfOCA_ldM-unsplash.jpg
photos\fpcamp-opUYWcQVQHg-unsplash.jpg
photos\greg-rakozy-oMpAz-DN-9I-unsplash.jpg
photos\harrison-broadbent-c3YpscwJb04-unsplash.jpg
photos\jorge-zapata-j2ExxxnN_w8-unsplash.jpg
photos\marco-tulio-de-miranda-NJrRhmQPLZc-unsplash.jpg
photos\marianna-smiley---JI4tAhTpE-unsplash.jpg
photos\mark-adriane-muS2RraYRuQ-unsplash.jpg
photos\maxim-hopman-Hin-rzhOdWs-unsplash.jpg
photos\possessed-photography-YKW0JjP7rlU-unsplash.jpg
photos\robo-wunderkind-oUgZVBaGcEQ-unsplash.jpg
photos\silvia-mc-donald-0mFarJHSy-M-unsplash.jpg
photos\tchelo-veiga-GOZKF3826Qc-unsplash.jpg
photos\thimo-van-leeuwen-EyAwxrQqAUE-unsplash.jpg
photos\victor-santos-pRcxVWRCs3k-unsplash.jpg
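One caveat worth knowing: iterdir() makes no promise about the order of the entries, and it also returns sub-folders and non-image files. A small sketch of a stricter listing, assuming you only want the .jpg files in alphabetical order, could look like this. The folder and file names are stand-ins created in a temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    photos = Path(tmp) / 'photos'
    photos.mkdir()
    # Hypothetical stand-in files: two images and one unrelated file.
    for name in ['b-photo.jpg', 'a-photo.jpg', 'notes.txt']:
        (photos / name).touch()

    # glob('*.jpg') keeps only the .jpg entries; sorted() fixes the order.
    jpg_names = [p.name for p in sorted(photos.glob('*.jpg'))]

print(jpg_names)  # ['a-photo.jpg', 'b-photo.jpg']
```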
  • Step 2 — Creating two Lists to convert them into a dictionary later
'''
Take the first 4 letters and store them in a dictionary along with the filename:
'jorg':'jorge-zapata-j2ExxxnN_w8-unsplash.jpg'
That way, later, we can map the first 4 letters to the .jpg image name,
e.g. photos/agustin-diaz-gargiulo-GTLJklnjn-E-unsplash.jpg
'''
import os
from pathlib import Path

name_key_list = []
file_value_list = []
#print(Path.cwd())
path = Path('photos')
files = path.iterdir()
for file in files:
    file_name = os.path.basename(file)
    file_value_list.append(file_name)

    artist_initial_letters = file_name[:4]
    name_key_list.append(artist_initial_letters)
# here are the two lists to convert to a dictionary
print(name_key_list)
print('\n')
print(file_value_list)
['agus', 'brun', 'clai', 'erik', 'erol', 'feli', 'fpca', 'greg', 'harr', 'jorg', 'marc', 'mari', 'mark', 'maxi', 'poss', 'robo', 'silv', 'tche', 'thim', 'vict']


['agustin-diaz-gargiulo-GTLJklnjn-E-unsplash.jpg', 'bruno-melo-XsAv0ItdT5w-unsplash.jpg', 'claiton-conto-phaf9ASn3Do-unsplash.jpg', 'erik-mclean-Cf-kY8HFJOs-unsplash.jpg', 'erol-ahmed-aIYFR0vbADk-unsplash.jpg', 'felix-girault-QyhfOCA_ldM-unsplash.jpg', 'fpcamp-opUYWcQVQHg-unsplash.jpg', 'greg-rakozy-oMpAz-DN-9I-unsplash.jpg', 'harrison-broadbent-c3YpscwJb04-unsplash.jpg', 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg', 'marco-tulio-de-miranda-NJrRhmQPLZc-unsplash.jpg', 'marianna-smiley---JI4tAhTpE-unsplash.jpg', 'mark-adriane-muS2RraYRuQ-unsplash.jpg', 'maxim-hopman-Hin-rzhOdWs-unsplash.jpg', 'possessed-photography-YKW0JjP7rlU-unsplash.jpg', 'robo-wunderkind-oUgZVBaGcEQ-unsplash.jpg', 'silvia-mc-donald-0mFarJHSy-M-unsplash.jpg', 'tchelo-veiga-GOZKF3826Qc-unsplash.jpg', 'thimo-van-leeuwen-EyAwxrQqAUE-unsplash.jpg', 'victor-santos-pRcxVWRCs3k-unsplash.jpg']
  • Step 3 — Converting the Two Lists into a Dictionary
name_file_dict = dict(zip(name_key_list , file_value_list))
print(name_file_dict)
{'agus': 'agustin-diaz-gargiulo-GTLJklnjn-E-unsplash.jpg', 'brun': 'bruno-melo-XsAv0ItdT5w-unsplash.jpg', 'clai': 'claiton-conto-phaf9ASn3Do-unsplash.jpg', 'erik': 'erik-mclean-Cf-kY8HFJOs-unsplash.jpg', 'erol': 'erol-ahmed-aIYFR0vbADk-unsplash.jpg', 'feli': 'felix-girault-QyhfOCA_ldM-unsplash.jpg', 'fpca': 'fpcamp-opUYWcQVQHg-unsplash.jpg', 'greg': 'greg-rakozy-oMpAz-DN-9I-unsplash.jpg', 'harr': 'harrison-broadbent-c3YpscwJb04-unsplash.jpg', 'jorg': 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg', 'marc': 'marco-tulio-de-miranda-NJrRhmQPLZc-unsplash.jpg', 'mari': 'marianna-smiley---JI4tAhTpE-unsplash.jpg', 'mark': 'mark-adriane-muS2RraYRuQ-unsplash.jpg', 'maxi': 'maxim-hopman-Hin-rzhOdWs-unsplash.jpg', 'poss': 'possessed-photography-YKW0JjP7rlU-unsplash.jpg', 'robo': 'robo-wunderkind-oUgZVBaGcEQ-unsplash.jpg', 'silv': 'silvia-mc-donald-0mFarJHSy-M-unsplash.jpg', 'tche': 'tchelo-veiga-GOZKF3826Qc-unsplash.jpg', 'thim': 'thimo-van-leeuwen-EyAwxrQqAUE-unsplash.jpg', 'vict': 'victor-santos-pRcxVWRCs3k-unsplash.jpg'}
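A side note on this design: a 4-letter key only works while no two artists share the same first four letters ('marc', 'mari' and 'mark' are close calls). The sketch below, using three file names from the listing above, builds the same kind of dictionary with a comprehension and checks for collisions. The variable names sample_dict and collisions are just for this example.

```python
from collections import Counter

# A few file names taken from the listing above.
file_names = [
    'marco-tulio-de-miranda-NJrRhmQPLZc-unsplash.jpg',
    'marianna-smiley---JI4tAhTpE-unsplash.jpg',
    'mark-adriane-muS2RraYRuQ-unsplash.jpg',
]

# Build the dictionary in one step with a comprehension.
sample_dict = {name[:4]: name for name in file_names}

# Count how often each 4-letter key appears; a count above 1 is a collision.
key_counts = Counter(name[:4] for name in file_names)
collisions = [key for key, count in key_counts.items() if count > 1]

print(sample_dict['mari'])  # marianna-smiley---JI4tAhTpE-unsplash.jpg
print(collisions)           # []
```

If collisions ever came back non-empty, a longer key (or the full artist name) would be the safer choice.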
  • Step 4 — Testing the Dictionary (name_file_dict)
name_file_dict['fpca']
'fpcamp-opUYWcQVQHg-unsplash.jpg'
  • Step 5 — Now, let’s check whether a file we’re looking for exists in the folder.
if (path / 'bruno-melo-XsAv0ItdT5w-unsplash.jpg').exists():
    # raw string, so the backslash in \o/ is printed as-is
    print(r'Yes, there is a file with that name \o/')
Yes, there is a file with that name \o/
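Keep in mind that exists() is also True for folders. If you specifically need a file, is_file() is the stricter test. Here is a small sketch in a temporary folder, with a made-up file name, that shows the difference.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / 'photos'
    folder.mkdir()
    (folder / 'picture.jpg').touch()   # hypothetical file name

    checks = {
        'file exists': (folder / 'picture.jpg').exists(),
        'file is_file': (folder / 'picture.jpg').is_file(),
        'folder exists': folder.exists(),
        'folder is_file': folder.is_file(),
        'missing exists': (folder / 'missing.jpg').exists(),
    }

for label, result in checks.items():
    print(label, '->', result)
```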
  • Step 6 — Creating a new folder (/organized)
Path('photos/organized').mkdir()
Fig 2. Creating the /organized dir and copying a .jpg to it! This is the Victoria amazonica, which has very large leaves, up to 3 m (10 ft) in diameter, that float on the water’s surface on a submerged stalk 7–8 m (23–26 ft) in length.
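If you run the mkdir() cell twice, the second run fails because the folder already exists. A minimal sketch of a more forgiving call, tried out in a temporary folder so nothing on your disk changes:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'photos' / 'organized'

    # parents=True also creates the missing 'photos' folder;
    # exist_ok=True makes a second call harmless.
    target.mkdir(parents=True, exist_ok=True)
    target.mkdir(parents=True, exist_ok=True)
    created = target.is_dir()

    # Without exist_ok, repeating the call raises FileExistsError.
    try:
        target.mkdir()
        raised = False
    except FileExistsError:
        raised = True

print(created, raised)  # True True
```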
  • Step 7 — Creating a copy of our file in the new folder we’ve just created (/organized)
import shutil
# forward slashes work on Windows, Mac, and Linux alike
file_to_copy = Path('photos/bruno-melo-XsAv0ItdT5w-unsplash.jpg')
file_to_paste = Path('photos/organized/bruno-melo-XsAv0ItdT5w-unsplash.jpg')
shutil.copy2(file_to_copy, file_to_paste)
WindowsPath('photos/organized/bruno-melo-XsAv0ItdT5w-unsplash.jpg')
  • Step 8 — Moving a file from one place to another (/organized → /organized/cities)
Path('photos/organized/cities').mkdir()
shutil.move(Path('photos/organized/bruno-melo-XsAv0ItdT5w-unsplash.jpg'),
            Path('photos/organized/cities/bruno-melo-XsAv0ItdT5w-unsplash.jpg'))
WindowsPath('photos/organized/cities/bruno-melo-XsAv0ItdT5w-unsplash.jpg')
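Steps 6 to 8 can be rehearsed safely before touching your real photos. The sketch below repeats the copy and the move inside a temporary folder with a fake image (sample.jpg is a made-up name) and records what is left where.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    photos = Path(tmp) / 'photos'
    cities = photos / 'organized' / 'cities'
    cities.mkdir(parents=True)
    original = photos / 'sample.jpg'      # hypothetical file name
    original.write_bytes(b'fake image bytes')

    # Copy: the original stays where it is.
    copied = Path(shutil.copy2(original, photos / 'organized' / 'sample.jpg'))
    # Move: the copy leaves /organized and lands in /organized/cities.
    moved = Path(shutil.move(str(copied), str(cities / 'sample.jpg')))

    original_still_there = original.exists()
    copy_still_there = copied.exists()
    moved_exists = moved.exists()

print(original_still_there, copy_still_there, moved_exists)  # True False True
```

The takeaway: copy2 leaves the source in place, while move removes it from the old location.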

— — — — Challenge — Make it Yourself! — — — —

Distribute the 20 images in the following directories:
- electronics
- inspiration
- cities
- nature
- trips
- university
According to the *theme* column of *MS Excel* Worksheet headers (03_file_manipulation_techniques.xlsx):
Fig 3. Make a directory for each theme.
For instance, Zapata's photo must be saved in the \electronics folder. That way we'll have a well-organized photo album, won't we?
Note 1:
To get the name of a file as text in pathlib, you can use the .name attribute (for example path.name or file.name):
path = Path('Folder/File.csv')
print(path.name)  # -> File.csv
Note 2:
Let me offer my reasoning: as you will have to loop over both the Excel data and the file list, consider creating two dictionaries from lists (name_file_dict and name_theme_dict), each keyed by the artist's first 4 letters, with the file name and the theme (directory) as the respective values.
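To go with Note 1, here is a short sketch of the path attributes that are most handy for this challenge, using one of the photo names from this tutorial.

```python
from pathlib import Path

photo = Path('photos/jorge-zapata-j2ExxxnN_w8-unsplash.jpg')

print(photo.name)      # jorge-zapata-j2ExxxnN_w8-unsplash.jpg
print(photo.stem)      # jorge-zapata-j2ExxxnN_w8-unsplash
print(photo.suffix)    # .jpg
print(photo.name[:4])  # jorg  (the 4-letter key from Note 2)
```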

— — — — — — — — — — — — — — — — — — —

  • Challenge_Step_1 — Let’s initialize by reading the excel file with Pandas:
import pandas as pd
themes_df = pd.read_excel('03_file_manipulation_techniques.xlsx')
themes_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20 entries, 0 to 19
Data columns (total 7 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   artist       20 non-null     object
 1   theme        20 non-null     object
 2   subject      20 non-null     object
 3   description  20 non-null     object
 4   Unnamed: 4   0 non-null      float64
 5   Unnamed: 5   0 non-null      float64
 6   Unnamed: 6   0 non-null      float64
dtypes: float64(3), object(4)
memory usage: 1.2+ KB
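The three "Unnamed" columns in the summary above hold no data at all (0 non-null). They do no harm here, but if you would rather drop them, one possible approach is dropna. The sketch below uses a tiny hand-made DataFrame as a stand-in for the worksheet, so sample_df and its contents are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the worksheet: two real columns plus an empty one.
sample_df = pd.DataFrame({
    'artist': ['Jorge Zapata', 'Erol Ahmed'],
    'theme': ['electronics', 'nature'],
    'Unnamed: 4': [float('nan'), float('nan')],
})

# how='all' drops a column only when every value in it is missing.
clean_df = sample_df.dropna(axis=1, how='all')

print(list(clean_df.columns))  # ['artist', 'theme']
```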
  • Challenge_Step_2 — Creating two lists to convert them into a dictionary later;

For the key, we use the first four letters of the artist’s name; for the value, the theme of that artist’s photo, which will be the directory name.

key_list = []
value_list = []
for i, artist in enumerate(themes_df['artist']):
    # key: first 4 letters of the artist's name, lowercased to match the file names
    key_list.append(artist[:4].lower())

    # value: the theme, which is also the name of the target directory
    theme = themes_df.loc[i, 'theme']
    value_list.append(theme)
    #print(f'(artist)-> {artist} (Theme)-> {theme}')
# here are the two lists to convert to a dictionary
print(key_list)
print('\n')
print(value_list)
['jorg', 'erik', 'robo', 'thim', 'feli', 'harr', 'erol', 'greg', 'mark', 'fpca', 'vict', 'marc', 'tche', 'clai', 'brun', 'mari', 'agus', 'poss', 'maxi', 'silv']


['electronics', 'electronics', 'electronics', 'inspiration', 'electronics', 'electronics', 'nature', 'inspiration', 'inspiration', 'trips', 'trips', 'university', 'cities', 'cities', 'cities', 'cities', 'cities', 'electronics', 'electronics', 'cities']
  • Challenge_Step_3 — Converting the Two Lists into a Dictionary
name_theme_dict = dict(zip(key_list , value_list))
print(name_theme_dict)
{'jorg': 'electronics', 'erik': 'electronics', 'robo': 'electronics', 'thim': 'inspiration', 'feli': 'electronics', 'harr': 'electronics', 'erol': 'nature', 'greg': 'inspiration', 'mark': 'inspiration', 'fpca': 'trips', 'vict': 'trips', 'marc': 'university', 'tche': 'cities', 'clai': 'cities', 'brun': 'cities', 'mari': 'cities', 'agus': 'cities', 'poss': 'electronics', 'maxi': 'electronics', 'silv': 'cities'}
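For the curious, the loop plus zip from the last two steps can also be written without an explicit loop, using the pandas string accessor. This is only an alternative sketch on a three-row stand-in DataFrame (sample_df and sample_theme_dict are names invented for the example), not a replacement for the steps above.

```python
import pandas as pd

# Hypothetical three-row stand-in for themes_df.
sample_df = pd.DataFrame({
    'artist': ['Jorge Zapata', 'MARK ADRIANE', 'fpcamp'],
    'theme': ['electronics', 'inspiration', 'trips'],
})

# .str[:4] slices every name; .str.lower() normalizes the case.
keys = sample_df['artist'].str[:4].str.lower()
sample_theme_dict = dict(zip(keys, sample_df['theme']))

print(sample_theme_dict)
# {'jorg': 'electronics', 'mark': 'inspiration', 'fpca': 'trips'}
```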
  • Challenge_Step_4 — Testing the Dictionary
name_theme_dict['fpca']
'trips'
  • Challenge_Step_5 — Creating the 6 Directories (6 Themes)
from pathlib import Path
import shutil
themes = ['electronics', 'inspiration', 'cities', 'nature', 'trips', 'university']
for theme in themes:
    #print(theme)
    # exist_ok=True lets you re-run this cell without an error
    Path('photos/{}'.format(theme)).mkdir(exist_ok=True)
Fig 4. Directories Just created! Now, transfer the photos into each one (see MS Excel sheet)
  • Challenge_Final_Step — Loop Over /photos and Look Over the Dictionary for the 4 Artist’s Initial Letters
path = Path('photos/')
# list() takes a snapshot of the folder before any file is moved
files = list(path.iterdir())
for file in files:
    file_name = file.name
    if file_name[-3:] == 'jpg':
        artist_initial_letters = file_name[:4]
        theme = name_theme_dict[artist_initial_letters]
        #print(theme)
        final_place = path / theme / file_name
        shutil.move(file, final_place)
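If you want to reuse this final step elsewhere, one option is to wrap it in a function. The sketch below is one possible way to do it, not the only one: organize_photos is a name I made up, it creates the theme folder on demand, and it skips (rather than crashes on) any photo whose 4-letter key is missing from the dictionary. It is demonstrated on empty stand-in files in a temporary folder.

```python
import shutil
import tempfile
from pathlib import Path


def organize_photos(folder, theme_by_key):
    """Move each .jpg in `folder` into the sub-folder named by its theme.

    Returns the names of files whose 4-letter key has no theme.
    """
    folder = Path(folder)
    skipped = []
    # list() takes a snapshot, so moving files does not disturb the loop.
    for file in list(folder.iterdir()):
        if not file.is_file() or file.suffix.lower() != '.jpg':
            continue
        theme = theme_by_key.get(file.name[:4].lower())
        if theme is None:
            skipped.append(file.name)
            continue
        destination = folder / theme
        destination.mkdir(exist_ok=True)
        shutil.move(str(file), str(destination / file.name))
    return skipped


with tempfile.TemporaryDirectory() as tmp:
    photos = Path(tmp)
    # Hypothetical stand-in files; the last one has no entry in the dictionary.
    for name in ['jorge-zapata.jpg', 'erol-ahmed.jpg', 'unknown-artist.jpg']:
        (photos / name).touch()

    skipped = organize_photos(photos, {'jorg': 'electronics', 'erol': 'nature'})
    in_electronics = (photos / 'electronics' / 'jorge-zapata.jpg').exists()
    in_nature = (photos / 'nature' / 'erol-ahmed.jpg').exists()

print(skipped)  # ['unknown-artist.jpg']
```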

Now let’s have some fun! — PANDAS Review!

Let’s Make Four Lists

from the PANDAS DATAFRAME made in Challenge_Step_1☝️

artist_list = []
theme_list = []
subject_list = []
description_list = []
for i, artist in enumerate(themes_df['artist']):
    artist = themes_df.loc[i, 'artist']
    artist_list.append(artist)

    theme = themes_df.loc[i, 'theme']
    theme_list.append(theme)

    subject = themes_df.loc[i, 'subject']
    subject_list.append(subject)

    description = themes_df.loc[i, 'description']
    description_list.append(description)

print(artist_list)
print('\n')
print(theme_list)
print('\n')
print(subject_list)
print('\n')
print(description_list)
['Jorge Zapata', 'Erik Mclean', 'Robo Wunderkind', 'Thimo van Leeuwen', 'Felix Girault', 'Harrison Broadbent', 'Erol Ahmed', 'Greg Rakozy', 'MARK ADRIANE', 'fpcamp', 'victor santos', 'Marco Túlio de Miranda', 'Tchelo Veiga', 'Claiton Conto', 'Bruno Melo', 'Marianna Smiley', 'Agustin Diaz Gargiulo', 'Possessed Photography', 'Maxim Hopman', 'Silvia Mc Donald']


['electronics', 'electronics', 'electronics', 'inspiration', 'electronics', 'electronics', 'nature', 'inspiration', 'inspiration', 'trips', 'trips', 'university', 'cities', 'cities', 'cities', 'cities', 'cities', 'electronics', 'electronics', 'cities']


['robotic', 'robotic', 'robotic', 'spine', 'arduino', 'raspiberry pi', 'cactus', 'universe', 'good vibe', 'caraca', 'rio das ostras', 'vicosa', 'ouro preto', 'ouro preto', 'manaus', 'salvador', 'rio de janeiro', 'artificial intelligence', 'mac', 'espaco alternativo']


['I am studing artificial intelligence - I received a certificate from huawei in artificial intelligence(https://medium.com/jungletronics/huawei-certification-heres-the-thing-493a6d60d478)', "wall-E, who doesn't remember star wars? as a young man I was amazed by the spillberg films.", 'Coding for kids with my site, Pleae, visit my website for children, learn about Lego, Raspi, etc: kidstronics -
https://medium.com/kidstronics', 'everything has a pattern - A (...)', "without a doubt, the best platform...I'll still have one - Mac Pro is designed for pros who need the ultimate in CPU performance.", 'What is the cheapest gym to join? For me here is where i do i do jog to get in shape: Espaço Alternativo - Porto Velho City - Ro - Brazil']

Computer vision!

Hi Python Computer Vision — PIL! An Intro To Python Imaging Library #PyVisionSeries — Episode #00

Please, visit this post BY CLICKING HERE: 👉 link

import numpy as np
import matplotlib.pyplot as ptl
%matplotlib inline
from PIL import Image
idx = artist_list.index('Jorge Zapata')
# forward slashes avoid backslash-escape problems and work on every OS
image = Image.open('photos/electronics/jorge-zapata-j2ExxxnN_w8-unsplash.jpg')
pic_arr = np.asarray(image)
ptl.imshow(pic_arr)
print('Photo Art By: Jorge Zapata - Details:')
print('\tTheme: ',theme_list[idx])
print('\tSubject: ',subject_list[idx])
print('\tDescription: ',description_list[idx])
Fig 5. Using computer images brings new knowledge!
  • Now let’s make a program to automate what we just did, okay?
r'''
Now let's make a program to automate what we just did, okay?
Instructions:
you can search by artist name
and see what these lovingly chosen photos mean to me.
It is a tribute to the wonderful work of these artists \o/
List of Artists (copy/paste when running the program below):
['Jorge Zapata', 'Erik Mclean', 'Robo Wunderkind', 'Thimo van Leeuwen',
'Felix Girault', 'Harrison Broadbent', 'Erol Ahmed', 'Greg Rakozy',
'MARK ADRIANE', 'fpcamp', 'victor santos', 'Marco Túlio de Miranda',
'Tchelo Veiga', 'Claiton Conto', 'Bruno Melo', 'Marianna Smiley',
'Agustin Diaz Gargiulo', 'Possessed Photography', 'Maxim Hopman', 'Silvia Mc Donald']
default: Marco Túlio de Miranda
'''
path = Path('photos/')
# an empty answer falls back to the default artist named above
artist_name = input('Insert the name of the Artist: ') or 'Marco Túlio de Miranda'
if artist_name in artist_list:
    # look up the index only after confirming the name is in the list,
    # otherwise .index() raises ValueError before the else branch can run
    idx = artist_list.index(artist_name)
    artist_initial_letters = artist_name[:4].lower()
    file_name = name_file_dict[artist_initial_letters]
    theme = name_theme_dict[artist_initial_letters]
    file_to_open = path / theme / file_name

    image = Image.open(file_to_open)
    pic_arr = np.asarray(image)
    ptl.imshow(pic_arr)

    print(artist_name, ' Photo Art: ')
    print('\tTheme: ', theme_list[idx])
    print('\tSubject: ', subject_list[idx])
    print('\tDescription: ', description_list[idx])
else:
    print(f'{artist_name} not on the artist list :/ Try Again! :)')
Fig 6. Running the program above. See this scenario? It is Caraça! Check it out!
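The program above expects the artist name typed exactly as it appears in the list, capitals included. If that feels too strict, a small helper could do the lookup case-insensitively. This is just a sketch under my own assumptions: find_photo and its parameters are invented for the example, and it only builds the expected path without opening the image.

```python
from pathlib import Path


def find_photo(artist_name, artists, file_by_key, theme_by_key, root='photos'):
    """Return the expected path of an artist's photo, or None if unknown."""
    # Compare names case-insensitively and ignore surrounding spaces.
    wanted = artist_name.strip().lower()
    known = {name.lower() for name in artists}
    if wanted not in known:
        return None
    key = wanted[:4]
    return Path(root) / theme_by_key[key] / file_by_key[key]


# Two entries borrowed from the dictionaries built earlier in this post.
artists = ['Jorge Zapata', 'MARK ADRIANE']
file_by_key = {'jorg': 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg',
               'mark': 'mark-adriane-muS2RraYRuQ-unsplash.jpg'}
theme_by_key = {'jorg': 'electronics', 'mark': 'inspiration'}

found = find_photo('  jorge zapata ', artists, file_by_key, theme_by_key)
missing = find_photo('Nobody', artists, file_by_key, theme_by_key)

print(found)
print(missing)  # None
```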

I hope you enjoyed that lecture o/

If you find this post helpful, please click the applause button and subscribe to the page for more articles like this one.

Until next time!

👉Jupyter notebook link :)

👉excel file link

👉or Colab link

👉git

Credits & References

Hashtag Treinamentos by João Paulo Rodrigues de Lira — Thank you dude!

Photos from https://unsplash.com/

Related Posts

00#Episode#PurePythonSeries — Lambda in Python — Python Lambda Desmistification

01#Episode#PurePythonSeries — Send Email in Python — Using Jupyter Notebook — How To Send Gmail In Python

02#Episode#PurePythonSeries — Automate Your Email With Python & Outlook — How To Create An Email Trigger System in Python

03#Episode#PurePythonSeries — Manipulating Files With Python —Manage Your Lovely Photos With Python! (this one)

04#Episode#PurePythonSeries — Pandas DataFrame Advanced — A Complete Notebook Review

05#Episode#PurePythonSeries — Is This Leap Year? Python Calendar — How To Calculate If The Year Is Leap Year and How Many Days Are In The Month

Credits & References

How To Code in Python

How To Construct For Loops in Python 3 https://www.digitalocean.com/community/tutorials/how-to-construct-for-loops-in-python-3

https://funkyenglish.com/idiom-get-your-feet-wet/


Hi, Guys o/ I am J3! I am just a hobby-dev, playing around with Python, Django, Ruby, Rails, Lego, Arduino, Raspy, PIC, AI… Welcome! Join us!