Guided: Build a CLI File Organizer in Python

In this Code Lab, you will organize diverse company data (images, audios, videos, and scripts) using the CLI. You will create distinct directories, sort files, address duplicates, archive old files, and apply encryption/decryption for data security. By the end of this lab, you will have enhanced your skills in organizing various data formats.

Level: Intermediate
Duration: 56m
Published: Dec 13, 2023

Table of Contents

  1. Challenge

    Introduction

    Hello, fellow managers! Your primary goal in this lab is to enhance data management, streamline access, and ensure the highest level of data security by building a CLI File Organizer in Python.

    Lab Scope

    You will be working with various data types:

    1. Images (.PNG, .JPG, and .GIF)
    2. Audios (.M4A and .MP3)
    3. Videos (.MP4)
    4. Scripts (.JSON)

    Throughout this lab, you will complete the following steps to add functionality to your CLI File Organizer:

    1. Create sub-directories based on file types.
    2. Sort files by their sizes.
    3. Handle duplicate file names.
    4. Archive files that are older than one year.
    5. Apply encryption/decryption to secure the files.

    Lab Structure

    You are working in the /home/ps-user/workspace directory, where you have four directories:

    1. src directory holds five Python scripts: starter.py, images.py, audios.py, videos.py, and scripts.py (one for each step).
    2. static directory holds all types of files that you must use while completing each step.
    3. data directory holds different files with the same data types. You must first validate all the tests and later use the Terminal to organize the data directory files by running the python3 organize.py command or using the Run button.
    4. solutions directory holds the completed code for each of the five files. You can view it at any time if you need further assistance to complete a task.

  2. Challenge

    Create Sub-directories

    Your first step is to categorize distinct files into their relevant sub-directories: images, audios, videos, and scripts. For this step, you will be working in the src/starter.py file.

    info> DID YOU KNOW?
    You can use a regular for loop to generate the same content instead of using a list comprehension.
    info> DID YOU KNOW?
    If you need to call os.makedirs() and are unsure whether the directories already exist on disk, pass the exist_ok=True argument. Existing directories are left unaltered and no error is raised; directories that do not exist yet are still created.
    info> DID YOU KNOW?
    You can create an if/elif statement without including the else statement.

    Once you have completed the tasks above, you will have created four sub-directories and moved all files into the directory for their data type. In the file tree, you should see the following static directory structure:

    static

    audios

    1. base.m4a
    2. base.mp3
    3. core.m4a
    4. core.mp3
    5. evening.m4a

    images

    1. scarlet.gif
    2. spring.jpg
    3. sunset.jpg
    4. vintage.png

    scripts

    1. flour.json
    2. fruit.json
    3. vegetable.json

    videos

    1. cheesy.mp4
    2. soy.mp4
    3. windy.mp4
    4. sizzler.mp4
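
The categorization in this step can be sketched as follows. This is a minimal illustration under stated assumptions, not the lab's solution: the function name `organize_by_type` and the `CATEGORIES` mapping are hypothetical and are derived only from the file types listed in the lab scope.

```python
import os
import shutil

# Assumed mapping from lower-cased file extension to target sub-directory.
CATEGORIES = {
    ".png": "images", ".jpg": "images", ".gif": "images",
    ".m4a": "audios", ".mp3": "audios",
    ".mp4": "videos",
    ".json": "scripts",
}


def organize_by_type(base_dir):
    """Move each file in base_dir into a sub-directory chosen by its extension."""
    # exist_ok=True leaves existing directories untouched and avoids FileExistsError.
    for sub_dir in sorted(set(CATEGORIES.values())):
        os.makedirs(os.path.join(base_dir, sub_dir), exist_ok=True)

    for name in os.listdir(base_dir):
        source = os.path.join(base_dir, name)
        if not os.path.isfile(source):
            continue  # skip the sub-directories themselves
        extension = os.path.splitext(name)[1].lower()
        target_dir = CATEGORIES.get(extension)
        if target_dir is not None:  # files with unknown extensions stay in place
            shutil.move(source, os.path.join(base_dir, target_dir, name))
```

A dictionary lookup is used here in place of an if/elif chain; either approach works, and the dictionary keeps the extension list in one place.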
  3. Challenge

    Sort Image Files According to their Size

    The next step is to sort the images in the static/images directory by file size in ascending order. The bar plot below shows the available files and their sizes:

    | Current Layout | Expected Layout |
    | -------- | -------- |
    | scarlet.gif | 1_scarlet.gif |
    | spring.gif | 2_spring.gif |
    | sunset.gif | 3_vintage.png |
    | vintage.png | 4_sunset.gif |

    For this step, you will work in the src/images.py file. You will start by retrieving the size of each file, then sort the files by size, and finally rename them within the directory.

    info> TRY IT OUT!
    Rerun this task and print the output of this function. Observe how the list elements do not maintain an order. This is because os.listdir() returns entries in arbitrary order.
    info> DID YOU KNOW?
    If you don't want to use the sorted() function, you can use the list's sort() method. It sorts the list in place and accepts a key argument just like the sorted() function.

    After completing these three steps, you will have arrived at the expected layout inside the static/images directory.
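
The three steps above (get sizes, sort, rename) can be sketched in a few lines. This is an illustrative sketch, not the lab's solution; the function name `rename_by_size` is hypothetical, and the numeric prefix format follows the expected layout shown in the table.

```python
import os


def rename_by_size(directory):
    """Prefix every file in directory with its 1-based rank by ascending size."""
    names = [
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    ]
    # os.listdir() returns entries in arbitrary order, so sort explicitly.
    # Ties in size are broken by name to keep the result deterministic.
    ordered = sorted(
        names,
        key=lambda name: (os.path.getsize(os.path.join(directory, name)), name),
    )
    new_names = []
    for rank, name in enumerate(ordered, start=1):
        new_name = f"{rank}_{name}"
        os.rename(os.path.join(directory, name), os.path.join(directory, new_name))
        new_names.append(new_name)
    return new_names
```

Note that running the function a second time would add another prefix to the already renamed files, so it is meant to be run once per directory.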

  4. Challenge

    Handle Duplicate Audio File Names

    In this step, you will switch to the src/audios.py file and will work within the static/audios directory. Here, you have to rename files based on the following format:

    If a file has a unique name, append the suffix _0.

    If multiple files share the same name (ignoring the extension), append the suffixes _0, _1, _2, and so on.

    The duplicate name frequency of each file is depicted in the given figure:

    A pie chart titled "Duplicate instances of audio files" with three segments. The "base" segment is light orange and constitutes 40% (2 instances). The "core" segment is dark orange and constitutes another 40% (2 instances). The "evening" segment is red and constitutes 20% (1 instance).

    | Current Layout | Expected Layout |
    | -------- | -------- |
    | base.m4a | base_0.m4a |
    | base.mp3 | base_1.mp3 |
    | core.m4a | core_0.m4a |
    | core.mp3 | core_1.mp3 |
    | evening.m4a | evening_0.m4a |

    The script already has the file names and their extensions saved in the variables aud_files and aud_exts.

    info> DID YOU KNOW?
    You can also complete the above task with pandas, using the pivot_table() function from the pandas package with the aggfunc argument set to size.

    After completing these two tasks, you will have arrived at the expected layout inside the static/audios directory.
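
The renaming rule can be sketched with a simple counter per file name. This is a minimal sketch under stated assumptions: the function name `suffix_duplicates` is hypothetical, and it reads the directory directly instead of using the lab's aud_files and aud_exts variables.

```python
import os
from collections import defaultdict


def suffix_duplicates(directory):
    """Rename stem.ext to stem_N.ext, where N counts earlier files with the same stem."""
    seen = defaultdict(int)  # stem -> number of files already renamed with that stem
    renamed = []
    # Sorted order makes the numbering reproducible (base.m4a before base.mp3).
    for name in sorted(os.listdir(directory)):
        stem, extension = os.path.splitext(name)
        new_name = f"{stem}_{seen[stem]}{extension}"
        seen[stem] += 1
        os.rename(os.path.join(directory, name), os.path.join(directory, new_name))
        renamed.append(new_name)
    return renamed
```

A file with a unique name receives _0 because its counter is still zero when it is processed.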

  5. Challenge

    Archive Video Files Older than One Year

    You have successfully completed half of the lab!

    Next, you must archive video files that are older than one year. To do so, you must first extract the last modified time and subtract it from the current time. If the file is older than 365 days, you compress it using the zipfile package.
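
The age check and compression described above can be sketched as follows. This is an illustrative sketch, not the lab's solution: the function name `archive_old_files` and its parameters are hypothetical, and the optional `now` argument exists only to make the age calculation easy to test.

```python
import os
import time
import zipfile

SECONDS_PER_DAY = 24 * 60 * 60


def archive_old_files(paths, archive_path, max_age_days=365, now=None):
    """Add every file older than max_age_days to a zip archive; return their paths."""
    now = time.time() if now is None else now
    old_paths = [
        path for path in paths
        if (now - os.path.getmtime(path)) / SECONDS_PER_DAY > max_age_days
    ]
    if old_paths:
        # The with statement closes the archive even if writing fails.
        with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
            for path in old_paths:
                # arcname stores only the file name, not the full directory path.
                archive.write(path, arcname=os.path.basename(path))
    return old_paths
```

The sketch leaves the original files in place; whether they should be deleted after archiving is a separate decision that the lab text does not specify.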

    The image depicts the last modified time of each video file:

    A bar chart titled "Last modified year of each file" with the x-axis labeled "Year" and the y-axis labeled "Number of files." There are two bars: one for the year 2000, colored orange and labeled "windy" with 1 file, and one for the year 2023, colored red and labeled "cheesy, sizzler, soy" with 3 files.

    You will be working in the src/videos.py file and the static/videos directory. The videos.py file already has a variable paths that holds complete path links to all videos.

    info> DID YOU KNOW?
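    A file is associated with three timestamps: access, modify, and change or create. In this task, you are using the last file modification time. To use the other timestamps, you can rely on the os.path.getatime() and os.path.getctime() functions. Note that os.path.getctime() returns the creation time on Windows but the time of the last metadata change on Unix-like systems.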
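
As a small illustration of the three timestamps, the sketch below reads all of them for a single file. The function name `file_timestamps` is hypothetical and not part of the lab's code.

```python
import os
from datetime import datetime


def file_timestamps(path):
    """Return the access, modification, and ctime timestamps of path as datetimes."""
    return {
        "accessed": datetime.fromtimestamp(os.path.getatime(path)),
        "modified": datetime.fromtimestamp(os.path.getmtime(path)),
        # On Unix this is the last metadata change; on Windows it is the creation time.
        "ctime": datetime.fromtimestamp(os.path.getctime(path)),
    }
```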
    info> DID YOU KNOW?
    Python offers different modules for compression, employing algorithms such as gzip, bzip2, lzma, and more. Comprehensive information about these modules can be found in the Python standard library documentation.

    info> FOOD FOR THOUGHT
    Have you wondered what would happen if you open the file without using the with statement? Hint: Try to read the file content outside the with statement.
    After successfully completing the above two tasks, you should see the comp_vids.zip archive inside the static/videos directory.

    If you run this lab before November 25, 2024, the zip archive will hold only the windy.mp4 file. If you run it after that date, all the video files will be compressed.

  6. Challenge

    Encrypt and Decrypt Scripts

    In the last step of this lab, you will encrypt and decrypt the files inside the static/scripts directory using the src/scripts.py file.

    The script uses the Fernet class from the cryptography package to perform this step. It already has a function named load_key(), which loads and returns the key stored at /workspace/mykey.key.

    You will be using this key to encrypt and decrypt all three JSON files.
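
A minimal sketch of in-place encryption and decryption with Fernet is shown below. The lab's own encrypt_files() and decrypt_files() functions are not shown in this description, so the signatures here (a directory and a key) are assumptions, and the helper `_transform_files` is hypothetical. The sketch requires the third-party cryptography package.

```python
import os
from cryptography.fernet import Fernet


def _transform_files(directory, transform):
    """Apply transform (bytes -> bytes) to every file in directory, in place."""
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        with open(path, "rb") as handle:  # binary mode: Fernet works on bytes
            data = handle.read()
        with open(path, "wb") as handle:
            handle.write(transform(data))


def encrypt_files(directory, key):
    """Replace each file's content with its Fernet token (assumed signature)."""
    _transform_files(directory, Fernet(key).encrypt)


def decrypt_files(directory, key):
    """Reverse one round of encrypt_files() using the same key (assumed signature)."""
    _transform_files(directory, Fernet(key).decrypt)
```

With a key from Fernet.generate_key() or from the lab's load_key(), calling encrypt_files() and then decrypt_files() with the same key restores the original content. The key file should be kept outside the directory being encrypted, otherwise the key itself is overwritten with its encrypted form.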

    Before starting with the encryption process, open one of the three files and examine the data to get a sense of its appearance.

    info> FOOD FOR THOUGHT
    1. What would happen if you do not use the value b while reading (rb) and writing (wb) files? Hint: Run the function and check its response.
    2. Would it be possible to read files in their alphabetical sorted order before encrypting them? Hint: sorted() function.
    3. Can you create your own key instead of using the provided mykey.key token? Hint: Fernet.generate_key() function.


    info> FOOD FOR THOUGHT
    1. What would happen if you keep the mykey.key file inside the static/scripts directory before encrypting the directory files? Hint: Are you able to decrypt your files? If not, check the content of the mykey.key file. Is the token the same as it was before encryption?
    2. What would happen if you do not store your key on disk and call the encrypt_files() and decrypt_files() functions in separate runs? Hint: Check the key in both runs. Is it the same?

    info> DID YOU KNOW?
    You have the flexibility to run the encrypt_files() function multiple times, and each run will generate a distinct set of encrypted data. For decryption and retrieval of the original data, it is crucial to run the decrypt_files() function an equal number of times as the encryption function.
    Bravo! You have successfully organized your static directory and its distinct data types.

    Using Python CLI File Organizer

    You are now ready to apply these functions to new data using the Terminal. Observe the data directory and the organize.py file in the file tree.

    Run the organize.py file in the Terminal and follow the menu's steps to organize the data files.
