Manipulating Files With Python

Manage Your Lovely Photos With Python! #PurePythonSeries — Episode #03

J3
Jungletronics
Sep 10, 2021

What is file manipulation?

File manipulation covers tasks such as creating a file, copying or moving it, and removing a directory. These are very common operations in Python.

Challenge — Make it Yourself!

Gif 1. Program working, creating a directory, and manipulating the files.
Fig 1. Here is the beginning of everything!

👉️Colab nb

👉️Download xls and photos.zip

Instructions:

In Colab, sign up (or sign in), upload the .ipynb file, and run it. The provided code will set up the required directories and extract the photos into sample_data/photos. Just execute it, and you’re all set!

Python and File Manipulation On Your Machine

Mastering the os and pathlib Modules:

The os and pathlib modules are powerful tools for managing folders and files on your computer. While other modules may offer additional capabilities depending on your specific needs, these two are essential for tackling a wide range of file manipulation tasks. This tutorial will guide you through using these modules to efficiently solve common challenges.

Our Challenge:

  • You will need to distribute photos across the directories specified in the Excel sheet (03_file_manipulation_techniques.xlsx).

Key Points:

  • We will be using pathlib because it handles file paths seamlessly across different operating systems. Whether you're on Windows, Mac, or Linux, pathlib simplifies the path differences for us.
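Here is a minimal sketch of what that cross-platform behavior looks like. It uses the pure path classes, which never touch the disk, so it runs the same way on any operating system. The file name is borrowed from the photo set used later in this post.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The same "/" operator builds a path in each platform's own style.
posix_path = PurePosixPath('sample_data') / 'photos' / 'bruno-melo-XsAv0ItdT5w-unsplash.jpg'
windows_path = PureWindowsPath('sample_data') / 'photos' / 'bruno-melo-XsAv0ItdT5w-unsplash.jpg'

print(posix_path)    # forward slashes
print(windows_path)  # backslashes

# Name, stem and suffix are available without manual string slicing.
print(posix_path.name, posix_path.stem, posix_path.suffix)
```

In everyday code you would normally just write Path(...), which picks the right flavor for the machine it runs on.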

shutil Module:

  • While you could use os and pathlib for copying and pasting files, it's more complex. Fortunately, the shutil module is here to make those tasks easier.

Let's get started!

Overview:

  • Importing the module
from pathlib import Path
  • Listing All Files in Current Folder
files = Path.cwd().iterdir()
  • Copying a File
import shutil
shutil.copy2('file_to_copy.extension', 'name_of_copied_created.extension')
  • Moving a File; 2 methods:
Path('path/file.extension').rename('new_path/file.extension')

or

shutil.move(Path('path/file.extension'), Path('new_path/file.extension'))
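To try the operations from the overview without touching any real files, here is a small sketch that runs them inside a temporary directory. The file and folder names (photo.jpg, album) are made up for the demo.

```python
import shutil
import tempfile
from pathlib import Path

# Work inside a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    original = root / 'photo.jpg'  # hypothetical file name
    original.write_bytes(b'fake image data')

    # Copy (copy2 also tries to preserve metadata such as timestamps).
    copied = Path(shutil.copy2(original, root / 'photo_copy.jpg'))

    # Move, method 1: Path.rename (may fail across different file systems).
    (root / 'album').mkdir()
    copied.rename(root / 'album' / 'photo_copy.jpg')

    # Move, method 2: shutil.move (falls back to copy + delete when needed).
    shutil.move(str(original), str(root / 'album' / 'photo.jpg'))

    # List what ended up in the album folder.
    listing = sorted(p.name for p in (root / 'album').iterdir())

print(listing)
```

Both files should end up inside album, and the temporary directory is deleted automatically when the with block ends.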

1 Step — Listing All Files in the Photos Folder:

Let’s start by diving in and listing all the files in the /photos directory.

from pathlib import Path

# print(Path.cwd())

path = Path('sample_data/photos')

files = path.iterdir()
for file in files:
    print(file)
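Note that iterdir() returns sub-folders as well as files, in no guaranteed order. If you only want the photos, one option is glob() combined with sorted(). The sketch below builds a temporary folder as a stand-in for sample_data/photos; notes.txt is a made-up extra file.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    photos = Path(tmp)
    # Hypothetical folder content, standing in for sample_data/photos.
    (photos / 'bruno-melo-XsAv0ItdT5w-unsplash.jpg').touch()
    (photos / 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg').touch()
    (photos / 'notes.txt').touch()
    (photos / 'organized').mkdir()

    # glob('*.jpg') keeps only matching entries; sorted() gives a stable order.
    jpg_names = sorted(p.name for p in photos.glob('*.jpg'))

print(jpg_names)
```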

2 Step — Creating two Lists to convert them into a dictionary later:

'''
Take the first 4 letters of each file name and use them as a dictionary key,
with the full file name as the value:

'jorg': 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg'

That way, later, we can map the 4-letter key to the image name, e.g.
photos/agustin-diaz-gargiulo-GTLJklnjn-E-unsplash.jpg
'''
import os
from pathlib import Path

name_key_list = []
file_value_list = []

# print(Path.cwd())

path = Path('sample_data/photos')

files = path.iterdir()
for file in files:
    file_name = os.path.basename(file)
    file_value_list.append(file_name)
    artist_initial_letters = file_name[:4]
    name_key_list.append(artist_initial_letters)

# here are the two lists to convert to a dictionary
print(name_key_list)
print('\n')
print(file_value_list)

3 Step — Converting the Two Lists into a Dictionary:

name_file_dict = dict(zip(name_key_list , file_value_list))
print(name_file_dict)
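One thing to keep in mind: dict(zip(...)) silently keeps only the last value when two keys are equal, so two file names sharing the same first four letters would overwrite each other. Here is a small sketch to detect that. The third file name is invented just to force a clash.

```python
from collections import defaultdict

file_names = [
    'jorge-zapata-j2ExxxnN_w8-unsplash.jpg',
    'bruno-melo-XsAv0ItdT5w-unsplash.jpg',
    'jorge-example-000-unsplash.jpg',  # hypothetical, added to force a clash
]

# Group file names by their 4-letter prefix instead of overwriting.
files_by_prefix = defaultdict(list)
for name in file_names:
    files_by_prefix[name[:4]].append(name)

# Any prefix with more than one file would be ambiguous as a dictionary key.
collisions = {k: v for k, v in files_by_prefix.items() if len(v) > 1}
print(collisions)
```

If collisions comes back empty for your own folder, the 4-letter keys are safe to use.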

4 Step — Testing the Dictionary (name_file_dict):

name_file_dict['fpca']

5 Step — Now, let’s check whether the file we’re looking for exists in the folder:

if (path / 'bruno-melo-XsAv0ItdT5w-unsplash.jpg').exists():
    print('Yes, there is a file with that name \\o/')
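A small detail: exists() is also true for folders. When it matters whether the name points to a file or a directory, is_file() and is_dir() tell them apart. A quick sketch in a temporary folder (missing.jpg is a made-up name that is never created):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    photos = Path(tmp)
    (photos / 'organized').mkdir()
    (photos / 'bruno-melo-XsAv0ItdT5w-unsplash.jpg').touch()

    checks = {}
    for name in ['bruno-melo-XsAv0ItdT5w-unsplash.jpg', 'organized', 'missing.jpg']:
        candidate = photos / name
        # Tuple order: (exists, is_file, is_dir)
        checks[name] = (candidate.exists(), candidate.is_file(), candidate.is_dir())

for name, result in checks.items():
    print(name, result)
```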

6 Step — Creating a new folder (/organized):

# exist_ok=True lets you re-run the cell without a FileExistsError
Path('sample_data/photos/organized').mkdir(exist_ok=True)
Fig 2. Creating the /organized directory and copying .jpg files to it. The Victoria amazonica boasts enormous leaves, up to 3 meters (10 feet) in diameter, that float on the water’s surface, supported by a submerged stalk that can reach 7–8 meters (23–26 feet) in length.
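When a notebook cell is executed more than once, a plain mkdir() call fails the second time because the folder is already there. The sketch below shows the two optional arguments that help, parents and exist_ok, inside a temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'photos' / 'organized'

    # Without parents=True this would fail, because 'photos' does not exist yet.
    target.mkdir(parents=True)

    # A plain second call raises FileExistsError ...
    try:
        target.mkdir()
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True

    # ... while exist_ok=True makes the call safe to repeat.
    target.mkdir(parents=True, exist_ok=True)
    still_there = target.is_dir()

print(second_call_failed, still_there)
```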

7 Step — Copying our file into the newly created /organized folder:

import shutil

file_to_copy = Path(r'sample_data/photos/bruno-melo-XsAv0ItdT5w-unsplash.jpg')
file_to_paste = Path(r'sample_data/photos/organized/bruno-melo-XsAv0ItdT5w-unsplash.jpg')

shutil.copy2(file_to_copy, file_to_paste)
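Why copy2 and not copy? Both copy the content, but copy2 also tries to preserve metadata such as the modification time, which is handy for photos. Here is a sketch that makes the difference visible. The file name and the old timestamp are arbitrary values chosen for the demo.

```python
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / 'photo.jpg'  # hypothetical file name
    source.write_bytes(b'fake image data')

    # Pretend the photo was taken long ago (timestamp chosen arbitrarily).
    old_time = 1_000_000_000
    os.utime(source, (old_time, old_time))

    plain = Path(shutil.copy(source, Path(tmp) / 'plain.jpg'))
    with_meta = Path(shutil.copy2(source, Path(tmp) / 'with_meta.jpg'))

    plain_mtime = int(plain.stat().st_mtime)
    meta_mtime = int(with_meta.stat().st_mtime)

# copy() stamps the new file with the current time; copy2() keeps the old one.
print(plain_mtime == old_time, meta_mtime == old_time)
```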

8 Step — Moving a file from /organized to /organized/cities:

Path('sample_data/photos/organized/cities').mkdir()

shutil.move(Path(r'sample_data/photos/organized/bruno-melo-XsAv0ItdT5w-unsplash.jpg'), Path(r'sample_data/photos/organized/cities/bruno-melo-XsAv0ItdT5w-unsplash.jpg'))

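Here is a more robust version of the same move. It creates the destination folder if it is missing and reports any error instead of stopping the notebook: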

from pathlib import Path
import shutil

# Define the source and destination paths
source_file = Path('sample_data/photos/organized/bruno-melo-XsAv0ItdT5w-unsplash.jpg')
destination_dir = Path('sample_data/photos/organized/cities')
destination_file = destination_dir / source_file.name

# Create the destination directory if it doesn't exist
destination_dir.mkdir(parents=True, exist_ok=True)

try:
    # Move the file to the destination directory
    shutil.move(str(source_file), str(destination_file))
    print(f"File moved to {destination_file}")
except Exception as e:
    print(f"An error occurred: {e}")

We’re doing great so far. Now, let’s tackle a real challenge and put our knowledge to the test.

— — — — Challenge — Make it Yourself! — — — —

  • electronics
  • inspiration
  • cities
  • nature
  • trips
  • university
Fig 3. Make a directory for each theme.
For example, Zapata's photo should be saved in the \electronics folder. 
This ensures a well-organized photo album, right?
Note 1:
To extract a file name as text using pathlib, use Path.name or file.name:
path = Path('Folder/File.csv')
print(path.name) # Output: 'File.csv'
Note 2:
Here's my suggestion: since you'll need to loop over the rows of the Excel sheet and the list of files,
consider creating two dictionaries: name_file_dict and name_theme_dict.
Use the artist's 4-letter code as the key in both,
with the file name as the value in the first and the theme (directory) as the value in the second.

— — — — — — — — — — — — — — — — — — —

1 Challenge_Step — Let’s start by reading the Excel file with Pandas:

import pandas as pd

themes_df = pd.read_excel('sample_data/03_file_manipulation_techniques.xlsx')
themes_df.info()

2 Challenge_Step — Creating two lists to later convert into a dictionary:

  • Keys: The first four letters of each artist’s name.
  • Values: The theme of the artist’s photo, which will be used as directory names.
key_list = []
value_list = []

for i, artist in enumerate(themes_df['artist']):
    # Key: first four letters of the artist's name, lowercased to match the file names
    key_list.append(artist[:4].lower())

    # Value: the theme stored on the same row
    theme = themes_df.loc[i, 'theme']
    value_list.append(theme)
    # print(f'(artist)-> {artist} (Theme)-> {theme}')

# here are the two lists to convert to a dictionary
print(key_list)
print('\n')
print(value_list)

3 Challenge_Step — Converting Two Lists into a Dictionary:

name_theme_dict = dict(zip(key_list , value_list))
print(name_theme_dict)
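The same kind of dictionary can also be built in one go with a dict comprehension. The sketch below uses a plain list of dicts in place of the DataFrame (themes_df.to_dict('records') produces this shape). The theme values here are placeholders for the demo, not the data from the sheet, and the name demo_theme_dict is used so that it does not overwrite name_theme_dict.

```python
# Rows as plain dicts; theme values are placeholders, not the sheet's data.
rows = [
    {'artist': 'Jorge Zapata', 'theme': 'electronics'},
    {'artist': 'Bruno Melo', 'theme': 'cities'},
    {'artist': 'fpcamp', 'theme': 'nature'},
]

# strip() guards against stray spaces that can creep into spreadsheet cells.
demo_theme_dict = {
    row['artist'].strip()[:4].lower(): row['theme'].strip()
    for row in rows
}
print(demo_theme_dict)
```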

4 Challenge_Step — Testing the Dictionary:

name_theme_dict['fpca']

5 Challenge_Step — Creating 6 Directories (6 Themes):

from pathlib import Path
import shutil

themes = ['electronics', 'inspiration', 'cities', 'nature', 'trips', 'university']
for theme in themes:
    # print(theme)
    # exist_ok=True lets you re-run the cell without a FileExistsError
    Path('sample_data/photos/{}'.format(theme)).mkdir(exist_ok=True)
Fig 4. Directories Just created! Now, transfer the photos into each one (see MS Excel sheet)

6 Challenge_Final_Step — Loop through /photos and look up the first four letters of each file name in the dictionary:

path = Path('sample_data/photos/')
# list() takes a snapshot, so moving files does not disturb the iteration
files = list(path.iterdir())

for file in files:
    file_name = file.name
    # Only .jpg files are moved; sub-folders are skipped
    if file.suffix == '.jpg':
        artist_initial_letters = file_name[:4]
        # print(artist_initial_letters)
        theme = name_theme_dict[artist_initial_letters]
        # print(theme)
        final_place = path / theme / file_name
        shutil.move(str(file), str(final_place))
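If a photo shows up whose first four letters are not in the dictionary, a direct lookup raises KeyError and the loop stops halfway. As a sketch of one way to handle that, the hypothetical function organize_photos below skips unknown files and reports them. It is not part of the original notebook, and unknown-artist-unsplash.jpg is a made-up file.

```python
import shutil
import tempfile
from pathlib import Path

def organize_photos(photo_dir, theme_by_prefix):
    """Move each .jpg in photo_dir into a sub-folder named after its theme.

    Returns the names of files whose 4-letter prefix has no theme.
    """
    skipped = []
    # list() takes a snapshot so moving files does not disturb the iteration.
    for file in list(photo_dir.iterdir()):
        if not file.is_file() or file.suffix.lower() != '.jpg':
            continue
        theme = theme_by_prefix.get(file.name[:4].lower())
        if theme is None:
            skipped.append(file.name)
            continue
        target_dir = photo_dir / theme
        target_dir.mkdir(exist_ok=True)
        shutil.move(str(file), str(target_dir / file.name))
    return sorted(skipped)

with tempfile.TemporaryDirectory() as tmp:
    photos = Path(tmp)
    (photos / 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg').touch()
    (photos / 'unknown-artist-unsplash.jpg').touch()  # hypothetical, not in the sheet
    skipped = organize_photos(photos, {'jorg': 'electronics'})
    moved_ok = (photos / 'electronics' / 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg').is_file()

print(skipped, moved_ok)
```

In the notebook, the call would look like organize_photos(Path('sample_data/photos'), name_theme_dict).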

Now let’s have some fun! — PANDAS Review!

Let’s Make Four Lists

from the PANDAS DATAFRAME made in Challenge_Step_1☝️

artist_list = []
theme_list = []
subject_list = []
description_list = []

for i, artist in enumerate(themes_df['artist']):
    artist = themes_df.loc[i, 'artist']
    artist_list.append(artist)

    theme = themes_df.loc[i, 'theme']
    theme_list.append(theme)

    subject = themes_df.loc[i, 'subject']
    subject_list.append(subject)

    description = themes_df.loc[i, 'description']
    description_list.append(description)

print(artist_list)
print('\n')
print(theme_list)
print('\n')
print(subject_list)
print('\n')
print(description_list)
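Four parallel lists work, but they only stay in sync as long as every list is indexed the same way. As an alternative sketch, a single dictionary keyed by artist name keeps each row together. The rows below are placeholders; in the notebook they would come from themes_df.

```python
# Placeholder rows; the subject and description texts are invented for the demo.
rows = [
    {'artist': 'Jorge Zapata', 'theme': 'electronics',
     'subject': 'placeholder subject', 'description': 'placeholder description'},
    {'artist': 'Bruno Melo', 'theme': 'cities',
     'subject': 'placeholder subject', 'description': 'placeholder description'},
]

# One lookup table keyed by artist name replaces four index-aligned lists.
details_by_artist = {row['artist']: row for row in rows}

details = details_by_artist['Jorge Zapata']
print(details['theme'])
```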

Computer vision!

Hi Python Computer Vision — PIL! An Intro To Python Imaging Library #PyVisionSeries — Episode #00

Please, visit this post BY CLICKING HERE: 👉 link

import numpy as np
import matplotlib.pyplot as ptl
%matplotlib inline
from PIL import Image



i = artist_list.index('Jorge Zapata')

image = Image.open('sample_data/photos/electronics/jorge-zapata-j2ExxxnN_w8-unsplash.jpg')
pic_arr = np.asarray(image)
ptl.imshow(pic_arr)

print('Photo Art By: Jorge Zapata - Details:')
print('\tTheme: ',theme_list[i])
print('\tSubject: ',subject_list[i])
print('\tDescription: ',description_list[i])
Fig 5. Using computer images brings new knowledge!
  • Now let’s make a program to automate what we just did, okay?
r'''
Instructions:

You can search by artist name
and see what these lovingly chosen photos mean to me.

It is a tribute to the wonderful work of these artists \o/

List of Artists (copy/paste when running the program below):

['Jorge Zapata', 'Erik Mclean', 'Robo Wunderkind', 'Thimo van Leeuwen',
'Felix Girault', 'Harrison Broadbent', 'Erol Ahmed', 'Greg Rakozy',
'MARK ADRIANE', 'fpcamp', 'victor santos', 'Marco Túlio de Miranda',
'Tchelo Veiga', 'Claiton Conto', 'Bruno Melo', 'Marianna Smiley',
'Agustin Diaz Gargiulo', 'Possessed Photography', 'Maxim Hopman',
'Silvia Mc Donald']

default: Marco Túlio de Miranda

'''

path = Path('sample_data/photos/')

artist_name = input('Insert the name of the Artist: ')

if artist_name in artist_list:
    # Look up the index only after confirming the name is on the list;
    # otherwise list.index() raises ValueError
    idx = artist_list.index(artist_name)
    artist_initial_letters = artist_name[:4].lower()
    file_name = name_file_dict[artist_initial_letters]
    theme = name_theme_dict[artist_initial_letters]
    file_to_open = path / theme / file_name

    image = Image.open(file_to_open)
    pic_arr = np.asarray(image)
    ptl.imshow(pic_arr)

    print(artist_name, ' Photo Art: ')
    print('\tTheme: ', theme_list[idx])
    print('\tSubject: ', subject_list[idx])
    print('\tDescription: ', description_list[idx])
else:
    print(f'{artist_name} not on the artist list :/ Try Again! :)')
Fig 6. Running the program above? Check out this scenario: it’s Caraça!
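The lookup part of the program can also be pulled into a small function, which makes it easy to try without input() or image display. This is only a sketch: photo_path_for is a hypothetical helper, and the two tiny dictionaries stand in for name_file_dict and name_theme_dict.

```python
from pathlib import Path

def photo_path_for(artist_name, file_by_prefix, theme_by_prefix,
                   base=Path('sample_data/photos')):
    """Return the expected photo path for an artist, or None if unknown."""
    # strip() and lower() make the lookup tolerant of spaces and capital letters.
    prefix = artist_name.strip()[:4].lower()
    file_name = file_by_prefix.get(prefix)
    theme = theme_by_prefix.get(prefix)
    if file_name is None or theme is None:
        return None
    return base / theme / file_name

# Tiny stand-ins for name_file_dict and name_theme_dict.
files = {'jorg': 'jorge-zapata-j2ExxxnN_w8-unsplash.jpg'}
themes = {'jorg': 'electronics'}

found = photo_path_for('  Jorge Zapata ', files, themes)
missing = photo_path_for('Nobody Here', files, themes)
print(found, missing)
```

Using .get() instead of square brackets means an unknown artist gives None rather than a KeyError.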

I hope you enjoyed the lecture! 🎉

If you found this post helpful, please click the applause button and subscribe for more content like this.

See you next time!

👉Jupyter notebook link :)

👉excel file link

👉or Colab link

👉git

Credits & References

Hashtag Treinamentos by João Paulo Rodrigues de Lira — Thank you dude!

Photos from https://unsplash.com/

Related Posts

00#Episode#PurePythonSeries — Lambda in Python — Python Lambda Desmistification

01#Episode#PurePythonSeries — Send Email in Python — Using Jupyter Notebook — How To Send Gmail In Python

02#Episode#PurePythonSeries — Automate Your Email With Python & Outlook — How To Create An Email Trigger System in Python

03#Episode#PurePythonSeries — Manipulating Files With Python — Manage Your Lovely Photos With Python! (this one)

04#Episode#PurePythonSeries — Pandas DataFrame Advanced — A Complete Notebook Review

05#Episode#PurePythonSeries — Is This Leap Year? Python Calendar — How To Calculate If The Year Is Leap Year and How Many Days Are In The Month

06#Episode#PurePythonSeries — List Comprehension In Python — Locked-in Secrets About List Comprehension

07#Episode#PurePythonSeries — Graphs — In Python — Extremely Simple Algorithms in Python

08#Episode#PurePythonSeries — Decorator in Python — How To Simplifying Your Code And Boost Your Function

10#Episode#PurePythonSeries — CS50 — A Taste of Python — Harvard Mario’s Challenge Solver \o/

11#Episode#PurePythonSeries — Python — Send Email Using SMTP — Send Mail To Any Internet Machine (SMTP or ESMTP)

12#Episode#PurePythonSeries — Advanced Python Technologies qrcode, Speech Recognition in Python, Google Speech Recognition

13#Episode#PurePythonSeries — Advanced Python Technologies II — qFace Recognition w/ Jupyter Notebook & Ubuntu

14#Episode#PurePythonSeries — Advanced Python Technologies III — Face Recognition w/ Colab

15#Episode#PurePythonSeries — ISS Tracking Project — Get an Email alert when International Space Station (ISS) is above of us in the sky, at night

16#Episode#PurePythonSeries — Using Gemini Chat on Collab — Random Number Generation, List Manipulation & Rock-Paper-Scissors Game Implementations

17#Episode#PurePythonSeries — Python — Basics — Functions, OOP, file handling, calculator, loops

18#Episode#PurePythonSeries — Python — Efficient File Handling in Python — Best Practices and Common Methods

Credits & References

How To Code in Python

How To Construct For Loops in Python 3 https://www.digitalocean.com/community/tutorials/how-to-construct-for-loops-in-python-3

Reviewed in August 2024


😎 Gilberto Oliveira Jr | 🖥️ Computer Engineer | 🐍 Python | 🧩 C | 💎 Rails | 🤖 AI & IoT | ✍️