
How to get a file extension in Python?

A file extension identifies the format of a file and is placed after the filename, for example, abc.txt or prog.py, where .txt indicates a text file and .py indicates a Python file. Knowing the file extension is often essential in programming projects, and most programming languages provide functions for working with OS file paths.

Can’t do your Python assignment because you can’t find the file extension of a file you created or used in the program? In this quick tutorial, you will learn how to extract the file extension using Python’s built-in features. Two standard-library modules can get the file extension from a file path: os.path and pathlib.



If your homework asks you to find a file extension in Python and you have no clue how to do that, then this article is for you.

1. Using the os.path module to get the file extension in Python

The os.path module provides functions for working with path names in the form used by the Operating System (OS) you’re running your Python program on. Using these functions, you can split, join, and normalize file paths and get information about them.

This module contains a splitext() function, which splits the file path into a root and an extension. It returns the two values as a tuple of strings:

root – the path without its extension (parent directory and filename)

extension – the extension of the path, including the leading dot

Let’s consider an example that can help you understand how you can apply this function to your project. Suppose a file with the following path:

/Users/user/Documents/sampledoc.docx

To get the .docx extension from this path, import the module and unpack the result of os.path.splitext(path) into the root and extension variables.

Code:

import os
path = '/Users/user/Documents/sampledoc.docx'
root, extension = os.path.splitext(path)
print('Root:', root)
print('Extension:', extension)

Running the code prints the root and the extracted extension:

Root: /Users/user/Documents/sampledoc
Extension: .docx

Now you have the file extension separated! You can also rebuild the original file path by concatenating the root and the extension.
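As a minimal sketch of that round trip, the snippet below splits the sample path from this article and joins the two parts back together. The variable name rebuilt is only illustrative.

```python
import os.path

path = '/Users/user/Documents/sampledoc.docx'
root, extension = os.path.splitext(path)

# splitext() is documented so that root + extension gives back the input path
rebuilt = root + extension
print(rebuilt)          # /Users/user/Documents/sampledoc.docx
print(rebuilt == path)  # True
```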

The above code gives you the extension along with the dot. If you want to remove the dot before the extension, use the following code:

import os.path
my_path = r'path where the file is stored\file name.file extension'
ext = os.path.splitext(my_path)[1][1:]
print(ext)

The output of this code will contain the extension without the “.” separator. For the sampledoc.docx path used earlier, it will be “docx”. The splitext() function is a good choice when the os module is already imported and being used.
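It can help to see how splitext() behaves on paths that are less tidy than the sample one. The sketch below wraps the call in a small helper; the function name get_extension and its with_dot parameter are hypothetical names chosen for this example, not part of the os.path module.

```python
import os.path


def get_extension(path, with_dot=True):
    """Return the last extension of path, or '' if there is none."""
    extension = os.path.splitext(path)[1]
    # Slicing with [1:] is safe even when extension is an empty string
    return extension if with_dot else extension[1:]


print(get_extension('/Users/user/Documents/sampledoc.docx'))         # .docx
print(get_extension('/Users/user/Documents/sampledoc.docx', False))  # docx
print(repr(get_extension('/Users/user/Documents/README')))           # '' (no extension)
print(repr(get_extension('/Users/user/.bashrc')))                    # '' (leading dot belongs to the name)
print(get_extension('/Users/user/Documents/app_sample.tar.gz'))      # .gz (only the last extension)
```

Note that a file without an extension, or a hidden file that only starts with a dot, gives an empty string rather than an error.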

2. Using the pathlib module in Python

Python provides more than one solution to extract file extension from the file path so that you can do your homework easily. The pathlib module contains various classes representing system file paths supporting different operating systems. It comprises utility functions that you can use to get the root, file name, and file extension from a file path.

pathlib.Path() – takes the file path as a string argument and returns a new Path object.

It contains the following attributes:

parent – returns the parent directory of the path

name – returns the filename along with the extension of the path

suffix – returns the extension of the path

This is how you can implement the pathlib attributes:

import pathlib
path = pathlib.Path('/Users/user/Documents/sampledoc.docx')
print('Parent:', path.parent)
print('Filename:', path.name)
print('Extension:', path.suffix)

Output:

Parent: /Users/user/Documents
Filename: sampledoc.docx
Extension: .docx

There might be files with multiple extensions such as .tar.gz. The suffix attribute returns only the last extension (.gz here), so you lose the rest of the suffixes of the file path. If your homework assignment asks you to find all the extensions of the file path, then you need to use a different attribute. Luckily, pathlib.Path provides the suffixes attribute, which lists all the suffixes of the given file.

Use the code below:

import pathlib
path = pathlib.Path('/Users/user/Documents/app_sample.tar.gz')
print('Parent:', path.parent)
print('Filename:', path.name)
print('Extension:', ''.join(path.suffixes))

Output:

Parent: /Users/user/Documents
Filename: app_sample.tar.gz
Extension: .tar.gz

You can see that using the suffixes attribute, both of the extensions were printed. You can also store the extension value in a separate string variable so you can reuse it without reading the attributes again and again. The pathlib module is the better choice when your program takes an object-oriented approach. Moreover, it comes in handy when you need both single and multiple suffixes from the file path.
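A short sketch of storing both values in variables is shown here. The helper name describe_extensions is hypothetical, and the second file name (report.v2.pdf) is an invented example to show that suffixes treats every dot-separated part after the first as a suffix, which may not be what you want for versioned file names.

```python
import pathlib


def describe_extensions(file_path):
    """Return (last suffix, all suffixes joined) for a file path."""
    path = pathlib.Path(file_path)
    # suffix is a string; suffixes is a list of strings such as ['.tar', '.gz']
    return path.suffix, ''.join(path.suffixes)


last, full = describe_extensions('/Users/user/Documents/app_sample.tar.gz')
print(last)  # .gz
print(full)  # .tar.gz

# Every dot-separated part after the first counts as a suffix
last, full = describe_extensions('/Users/user/Documents/report.v2.pdf')
print(last)  # .pdf
print(full)  # .v2.pdf
```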

Conclusion

Programming can be fun if you know how to implement the functions correctly! There are many programming solutions available online, but they can sometimes be confusing. We are here to make learning easy for you.

We have combined some of the easiest ways in which you can retrieve the suffix of any file path on your OS. The splitext() function of the os.path module is the standard method; however, if you want to make an object-oriented program, then the pathlib module is the better fit. We hope this helps you complete your task!
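To compare the two approaches side by side, the sketch below runs both on the sample paths from this article plus one path without an extension (the README path is an assumed example). For these inputs both methods return the same last extension.

```python
import os.path
import pathlib

paths = [
    '/Users/user/Documents/sampledoc.docx',
    '/Users/user/Documents/app_sample.tar.gz',
    '/Users/user/Documents/README',
]

results = []
for p in paths:
    with_os_path = os.path.splitext(p)[1]   # functional style
    with_pathlib = pathlib.Path(p).suffix   # object-oriented style
    results.append((with_os_path, with_pathlib))
    print(p, repr(with_os_path), repr(with_pathlib))
```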


John Harper

