2022/01/07
Python is one of the most popular programming languages, if not the most popular. According to the PYPL index, Python has been at the top of the popularity ranking since around 2017-2018. Its simple syntax makes it easy to learn, and its built-in functions and types make it easy to transform data. Universities use Python to teach programming to beginners, and whenever someone wants to learn programming but doesn’t know where to start, the answer is almost always Automate the Boring Stuff with Python.
No programming language is perfect. Python, or at least the reference implementation (CPython), can be slower than compiled languages like C/C++, Rust or Go on heavier tasks. On simpler tasks, however, Python definitely has an advantage: the difference in execution time is negligible, and development time can be greatly reduced thanks to the built-in functions, types and modules.
Lately I’ve been writing a lot of programs and scripts in Python, and I almost always end up with a similar setup. When I start a new script I sometimes go back to older ones to copy part of that setup, which is why I decided to create this template. It sets up the basics I always use, so I don’t have to go through everything again or look up, once more, the old %-style formatting that the logging module uses.
I was partly inspired to write and share this post by another one I read, Minimal safe Bash script template. So here is the Python template:
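A minimal sketch of a template with the pieces described in this post. The description, epilog and option help texts are taken from the --help output shown further down. The __version__ value, the exact log format string and the optional argv parameter are assumptions made for illustration, so line numbers in the log output will not match the examples exactly.

```python
#!/usr/bin/env python3
"""Script template: argument parsing, logging and a main() entry point."""
import argparse
import logging

__version__ = "0.1.0"  # hypothetical version string


def setup_argparse(argv=None):
    """Parse command-line arguments (argv=None means sys.argv[1:])."""
    parser = argparse.ArgumentParser(
        description="This is a useful tool that does a lot of things",
        epilog="Author: Tomás Gutiérrez",
    )
    parser.add_argument("-d", "--debug", action="store_true",
                        help="set log level to DEBUG")
    parser.add_argument("-v", "--version", action="version",
                        version=f"%(prog)s {__version__}")
    return parser.parse_args(argv)


def setup_logging(debug=False):
    """Configure the root logger; show DEBUG messages only when asked."""
    level = logging.DEBUG if debug else logging.INFO
    # Assumed format: timestamp, padded level, function:line, message.
    logging.basicConfig(
        level=level,
        format="%(asctime)s [%(levelname)-8s] "
               "(%(funcName)s:%(lineno)d) %(message)s",
    )


def main(argv=None):
    args = setup_argparse(argv)
    setup_logging(args.debug)

    logging.info("Script starting")
    logging.debug("Debug!")
    logging.info("Finished running script")


if __name__ == "__main__":
    main()
```

The argv parameter is not required for normal use; it only makes the functions easier to call from tests or from another script.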
This template was tested with Python 3.7.12 and 3.10.1. The idea behind it is threefold. First, the script is organized more like programs in traditional languages, where execution starts from the main function and not just from any line in the file. Second, it can log whatever is happening inside the script, which is useful during development to see where the program is getting stuck or failing, and also once it is up and running. Finally, it accepts command-line arguments, so you don’t need hard-coded values that you have to edit and save every time you want to try something different. One of those arguments controls whether the logger shows debug messages, which are very verbose but very useful when you run into a bug.
The first line is a shebang. It is used in text files that are executed by an interpreter and tells the system which interpreter to use, in this case Python. And yes, Python files can have and use shebangs! Instead of executing the script with $ python main.py, it can be executed with $ ./main.py (after you have given it executable permissions).
The template uses two standard-library modules. argparse lets us easily write command-line interfaces (CLIs), so we can add flags and options to our program when we execute it. Later in the post I explain how it is configured and how it works.
logging lets us configure a logger. It can tell us what the program is doing, keep track of what it did while it ran, and differentiate messages by severity, from informational to critical, among other features the module provides.
The function setup_argparse() sets up the basic configuration for argparse. We create an ArgumentParser and give it a description of what our script does. Then we add arguments, which are the inputs or options the program accepts. In this case we define only two: one to show debug logs and one to display the version of the program. By default argparse also generates a --help option that shows information about the script, including its description and options.
If you leave it as it is, the script runs without any flags, since none of them are required. You can keep it like that or add arguments later.
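As a rough illustration of adding arguments later, the sketch below extends a parser with two optional arguments. The --count and --name options are hypothetical and not part of the template; both have defaults, so the script still runs without any flags.

```python
import argparse

parser = argparse.ArgumentParser(
    description="This is a useful tool that does a lot of things"
)
parser.add_argument("-d", "--debug", action="store_true",
                    help="set log level to DEBUG")
# Hypothetical extra options, added for illustration only.
parser.add_argument("-n", "--count", type=int, default=1,
                    help="how many times to repeat the task")
parser.add_argument("--name", default="world",
                    help="name to use in the greeting")

# Passing a list makes the example independent of the real command line.
args = parser.parse_args(["--count", "3"])
print(args.count, args.name, args.debug)  # 3 world False

# With no flags at all, every option falls back to its default.
defaults = parser.parse_args([])
print(defaults.count, defaults.name, defaults.debug)  # 1 world False
```

Using type=int means argparse converts the value and rejects non-numeric input with a usage error, so the rest of the script can treat args.count as a number.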
Here is an example of the output when --help is run:
$ ./main.py --help
usage: main.py [-h] [-d] [-v]

This is a useful tool that does a lot of things

options:
  -h, --help     show this help message and exit
  -d, --debug    set log level to DEBUG
  -v, --version  show program's version number and exit

Author: Tomás Gutiérrez
Then we have setup_logging(debug), which configures the logging module. Instead of using print() to show messages and manually adding “INFO” or “DEBUG” before every message, you can use the functions logging provides, such as logging.info("Information") or logging.error("What!"). Not only that: every time we log, the output includes the date and time, the function the call was made from, and the line number. The parameter debug is a bool that comes from the -d option in argparse; when it is set, messages from logging.debug("debug msg") are shown in the output.
I want to emphasize the importance of Python’s logging module. This is taken from the docs page:
The key benefit of having the logging API provided by a standard library module is that all Python modules can participate in logging, so your application log can include your own messages integrated with messages from third-party modules.
If you use other Python modules, you’ll likely be able to see log messages from those modules.
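A small sketch of that integration. The some_library logger stands in for a third-party module and is hypothetical, and the short format string is chosen only to make the logger names visible. Messages from both loggers end up in the same output because the handler is attached to the root logger.

```python
import io
import logging

# Capture the log in memory so the result can be inspected and printed.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("[%(levelname)s] %(name)s: %(message)s"))

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)

# Stand-in for a third-party module that logs through its own named logger.
library_logger = logging.getLogger("some_library")


def library_function():
    library_logger.info("connecting")


logging.info("Script starting")   # our own message, via the root logger
library_function()                # the "library" message, same destination

output = stream.getvalue()
print(output, end="")
```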
Here is an example when you run the script:
$ ./main.py
2022-01-02 21:57:02,734 [INFO ] (main:37) Script starting
2022-01-02 21:57:02,734 [INFO ] (main:40) Finished running script
Here with debug logs enabled:
$ ./main.py -d
2022-01-02 21:57:02,734 [INFO ] (main:37) Script starting
2022-01-02 21:57:02,734 [DEBUG ] (main:38) Debug!
2022-01-02 21:57:02,734 [INFO ] (main:40) Finished running script
Then we have our main function, which is where our main code goes. The first thing it does is parse the arguments and set up the logging module, and then the program starts. As a placeholder I put a few logging messages. Within the same main.py file you can create more functions and call them from main(), but the idea is to have a single place where we know our program starts and ends.
Finally we have the check on __name__ at the end of the script.
Python has special variables, one of them being __name__. Its value depends on how the script is executed. If we run it directly (./main.py), its value is '__main__'. If we import the script into another one with import main and access the value as main.__name__, the value is the name of the module (main). Here is an example:
$ ls
var.py my_script.py
$ cat var.py
import my_script
print(f"my_script.py: '{my_script.__name__}'")
print(f"var.py: '{__name__}'")
$ python var.py
my_script.py: 'my_script'
var.py: '__main__'
Most languages start from main(), so we can make Python behave the same way. That makes it easier to see where the code starts, and so easier to read.
But then you might ask yourself: why check whether the script is being executed directly or imported, if I can do the same by just calling main() directly at the end of the file?
Well, there is a difference between just calling main() and checking the value of the special variable. If at any point you want to import main.py as a module into another .py file, then the check is a must. When Python imports a module it executes that module's code, so an unguarded call to main() would run on import. Here is an example of the unintended behaviour:
$ cat my_script.py
def main():
    print("Hello, I'm in my_script.py")
main()
$ cat var.py
import my_script
print("Hello, I'm in var.py")
$ python var.py
Hello, I'm in my_script.py
Hello, I'm in var.py
We defined a main function in my_script.py that prints a string, and we call main() at the bottom of the file so that it runs when the script is executed. Then we import that script as a module in var.py and print another string. When we execute var.py, Python first executes my_script.py, because the import comes first, and because that file calls main() it prints the string from my_script.py. Only then does it execute the print from var.py.
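For comparison, here is a sketch of the same my_script.py with the check in place. Run directly, it still prints its message. Imported from var.py, __name__ is 'my_script' rather than '__main__', so main() is not called and only the string from var.py is printed.

```python
def main():
    print("Hello, I'm in my_script.py")


# Only call main() when this file is executed directly,
# not when it is imported as a module.
if __name__ == "__main__":
    main()
```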
I tried to keep the template as simple as possible so anyone can easily modify it to their needs or extend the options it already has, for example by configuring the logging module to write to a file or by adding more arguments with argparse.
I have created a public repository on GitLab, python-template, that contains the script and a README with basic instructions. Feel free to suggest improvements by sending an e-mail or by opening an issue or a merge request.
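One possible way to write the log to a file as well, shown as a sketch rather than as part of the template. The helper name add_file_logging and the format string are assumptions; the helper attaches a FileHandler to the root logger, so existing logging.info() calls need no changes.

```python
import logging

# Assumed format, chosen to resemble the log output shown earlier.
LOG_FORMAT = "%(asctime)s [%(levelname)-8s] (%(funcName)s:%(lineno)d) %(message)s"


def add_file_logging(path, level=logging.DEBUG):
    """Attach a file handler to the root logger and return it."""
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setLevel(level)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logging.getLogger().addHandler(handler)
    return handler


# Example use, after the usual logging setup:
#     add_file_logging("script.log")
#     logging.info("This goes to the console and to script.log")
```

The root logger's own level still applies first, so DEBUG messages reach the file only when the debug option has lowered that level.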