Automating Scripts with Python
Build a script that extracts file information and stores it in a list of dictionaries.
Python is a high-level, interpreted, general-purpose programming language that became one of the most popular languages in 2023. Its simplicity, readability, and versatility make it a great language for beginners learning to code. It is also well suited to writing scripts that automate various tasks. Scripting is the process of writing code to automate repetitive and mundane tasks. At the enterprise level, this frees up time and increases organizational efficiency.
Python has a vast standard library that provides a wide range of modules for tasks such as file handling, network programming, and more. Python is also platform agnostic, meaning that the same code can run on multiple platforms, such as Windows, Linux, or macOS. This makes it a great choice for cross-platform scripting.
In this tutorial, we will cover how to build and run a script remotely that extracts the file information requested in the scenario.
Scenario: Your company needs to learn about the files located on various machines. You have been asked to build a script that extracts information, such as the name and size, about the files in the current working directory and stores it in a list of dictionaries. Create a script that generates a list of dictionaries about the files in the working directory. Then print the list.
Prerequisites:
- AWS Cloud9 Environment or an Integrated Development Environment (IDE) for Python.
- Basic Python Knowledge
- How to “Commit” in GitHub
- Basic Scripting
Create Your AWS Cloud9 Environment.
So let’s go ahead and get started. You can use a Python IDE or the AWS Management Console. I will be using AWS. Log into your AWS Cloud9 environment. I have placed the instructions here: AWS Cloud9 Environment. AWS Cloud9 offers a rich code-editing experience and supports several different programming languages. We will also connect our GitHub repository, work within that repo, and use some existing files from GitHub.
Once you’re in the Cloud9 IDE, create and save a file inside your repo folder. My working folder, which I have cloned from GitHub, is called “red_python_luit_projects.” You can use the “git clone” command to bring your repository into the environment.
git clone -b <branch name> <repository URL>
Now that we have our cloned repository, we can create our working Python template. My working template and script within the Cloud9 IDE will be called “pythonprojectw13.py.”
Below is our Python script that will eventually run in our CLI.
In the CLI, we need to change into the directory where we placed our cloned GitHub repository. Type “ls” to list your existing files or “ls -a” to include hidden files.
I’m going to change directory to “red_python_luit_projects” and type “ls.”
We are now in the folder. Let’s make our Python file executable by running a “chmod” command.
chmod u+x <python_file_name>
My command is “chmod u+x pythonprojectw13.py.”
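For readers who would rather set the permission from Python, the following is a minimal sketch of what “chmod u+x” does, assuming a Linux or macOS machine. The helper name “make_user_executable” and the temporary demo file are hypothetical and only there for illustration.

```python
import os
import stat
import tempfile


def make_user_executable(path):
    """Add the owner-execute bit to `path`, similar to `chmod u+x path`."""
    current_mode = os.stat(path).st_mode
    os.chmod(path, current_mode | stat.S_IXUSR)


# Demonstrate on a throwaway file so no real file is modified.
with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as tmp:
    demo_path = tmp.name

make_user_executable(demo_path)

# Check whether the owner-execute bit is now set.
is_executable = bool(os.stat(demo_path).st_mode & stat.S_IXUSR)
print("Owner can execute:", is_executable)

os.remove(demo_path)  # clean up the demo file
```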
Create Script
Now we can move to our script. It’s already in the “pythonprojectw13.py” template. At the top of the template you can add a shebang line (optional). The shebang tells the shell which interpreter to use when the file is run as an executable.
#!/usr/bin/env python3.7
Import Modules
We will start the code with “import os” to import the os module. “import” is a Python statement that brings the functions of a module, here one from the standard library, into your script. This lets us use operating-system-dependent functionality without writing those generic functions ourselves. With it, we can access the files and directories on our computer that the script needs.
import os
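As a quick illustration of what the os module gives us, here is a small sketch that looks up the current working directory and lists its contents. The variable names “cwd” and “entries” are chosen for this example only.

```python
import os

cwd = os.getcwd()          # absolute path of the current working directory
entries = os.listdir(cwd)  # names of the files and folders directly inside it

print("Working directory:", cwd)
print("Number of entries:", len(entries))
```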
Create an Empty List
The “files = []” statement creates an empty list named “files,” which is where the dictionaries containing our file information will be stored.
files = []
The “for” statement loops over a sequence (such as a list or range). The “os.listdir()” function retrieves a list of all entries in a directory. If no directory is specified, it returns the files and directories in the current working directory. Each iteration creates a new dictionary named “file_info,” with two key-value pairs.
- Name: file_name, where “file_name” is the name of the file in the current iteration
- Size: os.path.getsize(file_name), where “os.path.getsize” is used to retrieve the size of the file in bytes.
The “append” method adds a single item to the end of a list. In the case below, “files.append” adds the dictionary “file_info” to the list “files.”
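The loop described above can be sketched as follows. This is one possible version, not necessarily the exact script from this tutorial: the function name “collect_file_info” and the temporary demo directory are made up for the example, and skipping sub-directories is an assumption based on the scenario asking only about files.

```python
import os
import tempfile


def collect_file_info(directory="."):
    """Return a list of {"Name", "Size"} dictionaries for files in `directory`."""
    files = []
    for file_name in os.listdir(directory):
        full_path = os.path.join(directory, file_name)
        if not os.path.isfile(full_path):
            continue  # assumption: skip sub-directories, report files only
        file_info = {
            "Name": file_name,
            "Size": os.path.getsize(full_path),  # size in bytes
        }
        files.append(file_info)
    return files


# Demo on a throwaway directory so the result is predictable.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "report.txt"), "w") as handle:
        handle.write("12345")  # 5 characters, 5 bytes
    demo_files = collect_file_info(demo_dir)

print(demo_files)  # [{'Name': 'report.txt', 'Size': 5}]
```

Calling “collect_file_info()” with no argument would list the current working directory instead.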
files.append(file_info)
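To see “append” on its own, here is a tiny sketch with made-up file names and sizes. Each call adds exactly one dictionary to the end of the list.

```python
files = []

# Hypothetical values, used only to show how append behaves.
file_info = {"Name": "example.txt", "Size": 1024}
files.append(file_info)
files.append({"Name": "notes.md", "Size": 2048})

print(len(files))         # 2
print(files[-1]["Name"])  # notes.md
```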
Create Dictionary Within a List that Includes File Name and Size
The “os.listdir()” function will return a list of names within the given directory. We will use a loop to iterate through the “os.listdir” output. Below is our code.
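path = os.getcwd()  # assumption: "path" is the current working directory
for item in os.listdir(path):
    <variable_name> = os.stat(os.path.join(path, item))
    <list_name>.append({'path': os.path.join(path, item), 'size': <variable_name>.st_size})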
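Both “os.stat” and “os.path.getsize” appear in this tutorial, so here is a short sketch showing that they report the same size for a file. The temporary directory and the file name “sample.txt” are hypothetical and exist only for the demo.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    sample_path = os.path.join(demo_dir, "sample.txt")
    with open(sample_path, "wb") as handle:
        handle.write(b"hello world")  # 11 bytes

    stat_result = os.stat(sample_path)                # full metadata for the file
    size_from_stat = stat_result.st_size              # size in bytes
    size_from_getsize = os.path.getsize(sample_path)  # same value, shorter call

print(size_from_stat, size_from_getsize)  # 11 11
```

“os.stat” is handy when you also want other metadata, such as the modification time, while “os.path.getsize” is enough when only the size matters.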
To complete our script, we will use another “for” loop to iterate over the list of file dictionaries. For each dictionary, we will print the values stored under its “Name” and “Size” keys.
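A minimal sketch of that printing loop, using a hand-written list with made-up values in place of the real output, could look like this. The output format is only a suggestion.

```python
# Hypothetical data standing in for the list built earlier in the script.
files = [
    {"Name": "example.txt", "Size": 1024},
    {"Name": "notes.md", "Size": 2048},
]

lines = []
for file_info in files:
    line = f"Name: {file_info['Name']}, Size: {file_info['Size']} bytes"
    lines.append(line)
    print(line)
```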
Run The Script
We will be using the full script located in “pythonprojectw13.py.” We made the file executable and placed it within the working directory.
Next, we can type the following command to see results from our script.
python [filename.py]
Our Script template
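Putting the steps together, a complete script might look like the sketch below. It is assembled from the pieces described in this tutorial rather than copied from the original file, so the function names “get_file_info” and “main,” the sorted order, and the choice to skip sub-directories are assumptions.

```python
#!/usr/bin/env python3
# Optional shebang; the tutorial's own example pins python3.7.
import os


def get_file_info(directory):
    """Build a list of dictionaries with the name and size of each file."""
    files = []
    for file_name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, file_name)
        if os.path.isfile(full_path):  # assumption: files only, no folders
            file_info = {
                "Name": file_name,
                "Size": os.path.getsize(full_path),  # size in bytes
            }
            files.append(file_info)
    return files


def main():
    files = get_file_info(os.getcwd())
    print(files)  # the scenario asks us to print the list itself
    for file_info in files:
        print(f"Name: {file_info['Name']}, Size: {file_info['Size']} bytes")


if __name__ == "__main__":
    main()
```

Saved as “pythonprojectw13.py,” this can be run with the “python” command shown above.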
We have just finished scripting with Python. Thanks for following along. If this helped you in any way, feel free to drop a comment.