Types of underscores
Underscores show up in several places in Python, and they can be confusing if you aren't familiar with the conventions and concepts behind them. There are basically five types of underscores you might come across:
- simple underscore: _
- leading underscore: _x
- subsequent underscore: x_
- double leading underscore: __x
- double leading and subsequent underscore: __x__
The simple underscore in Python
The simple underscore is often used as a variable when someone wants to indicate that the variable doesn't matter.
    for _ in range(42):
        print("The meaning of life")
As you can see, the loop variable itself is never used here; the loop only exists to print "The meaning of life" 42 times.
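The same convention is handy when unpacking values you don't need. A minimal sketch, assuming a made-up record (the names and values are purely illustrative):

```python
# Hypothetical sample record: (name, age, city)
record = ("Ada", 36, "London")

# Only the age is needed; the other fields are deliberately ignored.
_, age, _ = record

# Build a list of fixed size without caring about the loop counter.
greetings = ["hello" for _ in range(3)]

print(age)        # 36
print(greetings)  # ['hello', 'hello', 'hello']
```

Note that _ is still an ordinary variable; using it is just a signal to the reader that its value is not of interest.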
Another use of the simple underscore is in the Python REPL (read-evaluate-print loop), where _ holds the result of the last evaluated expression:
    In[1]: a = 42
    In[2]: b = 1337
    In[3]: a*b
    Out[3]: 56154
    In[4]: _
    Out[4]: 56154
The leading underscore in Python
The leading underscore indicates that the variable is meant to be private to its class. However, Python does not enforce visibility; this is just a convention. When someone uses a leading underscore, all they want to say is that the name should not be accessed from outside. If you have written Java or another language with visibility modifiers, you have probably come across this concept already. For those starting out with Python, it may seem odd.
    class PublicClass:
        def __init__(self, attribute):
            self.attribute = attribute
            self._private_attribute = "Initialized"  # I am private

        def get_private_attribute(self):
            return self._private_attribute

        def set_private_attribute(self, value):
            value += "I am statically appended"
            # Do something fancy
            self._private_attribute = value
So, looking at this class, the underscore indicates that _private_attribute is private and should not be accessed directly; the getter and setter are the intended way to work with it. In contrast to this example, if there are no methods to interact with the attribute at all, this usually indicates that the variable is meant to be used only internally, that is, within the class, method, or whatever defines it.
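To make the "just a convention" point concrete, here is a small sketch. The Account class and its methods are hypothetical and exist only for this illustration:

```python
class Account:
    """Hypothetical class used only for illustration."""

    def __init__(self, balance):
        # Leading underscore: "internal, please don't touch".
        self._balance = balance

    def deposit(self, amount):
        self._balance += amount

    def get_balance(self):
        return self._balance


account = Account(100)
account.deposit(50)

# The intended way to read the value.
print(account.get_balance())  # 150

# Nothing stops outside code from reaching in directly;
# the underscore is only a hint to other programmers.
account._balance = -1
print(account.get_balance())  # -1
```

The interpreter accepts the direct assignment without complaint, which is why the leading underscore relies entirely on people respecting it.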
Subsequent underscore in Python
The subsequent (or trailing) underscore merely indicates that the original name would conflict with a reserved keyword. For example, if I want to name a function parameter lambda, the interpreter raises a SyntaxError because lambda is a keyword:
    def function(lambda):
        pass
If I simply use an underscore after lambda, the interpreter will accept it again and I am still able to use my meaningful name:
    def function(lambda_):
        pass
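If you are unsure whether a name is reserved, the standard library's keyword module can tell you. The following sketch uses a hypothetical make_tag helper to show a typical case, a parameter that would naturally be called class:

```python
import keyword


def make_tag(name, class_=None):
    """Build a tiny HTML-like tag; `class` is reserved, so the parameter is `class_`."""
    if class_ is None:
        return f"<{name}>"
    return f'<{name} class="{class_}">'


print(keyword.iskeyword("lambda"))    # True
print(keyword.iskeyword("lambda_"))   # False
print(make_tag("div", class_="box"))  # <div class="box">
```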
The double leading underscores in Python
This is the first kind of underscore we have covered so far that isn't merely a convention. When you add double leading underscores to an attribute name inside a class, so-called name mangling is applied: the compiler rewrites the name to include the class name, which protects the attribute from being accidentally overwritten by subclasses. A short example:
    class MyClass:
        def __init__(self):
            self.__MyAttribute = 42
If we now create an instance of MyClass:
    In[1]: from underscores import MyClass
    In[2]: myclass = MyClass()
    In[3]: dir(myclass)
    Out[3]: ['_MyClass__MyAttribute', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__']
We see that our instance now has a weirdly named attribute, _MyClass__MyAttribute, and as you can see below, you cannot access it under the name you gave it in the constructor.
    In[1]: myclass.__MyAttribute
    Out[1]: Traceback (most recent call last):
      File "<input>", line 1, in <module>
    AttributeError: 'MyClass' object has no attribute '__MyAttribute'
    In[2]: myclass._MyClass__MyAttribute
    Out[2]: 42
To access the attribute from outside, you instead have to prefix the name with an underscore and the class name. This keeps subclasses from accidentally overwriting the attribute, because the mangled name is specific to the defining class. Inside the class itself, you can still use __MyAttribute instead of the mangled name. Nevertheless, as you may assume, this isn't a guarantee. If you assign to _MyClass__MyAttribute directly, you can still overwrite it (but I absolutely cannot imagine why you would do this).
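The protection against subclasses is easiest to see when a parent and a child class both use the same double-underscore name. A minimal sketch with hypothetical Base and Child classes:

```python
class Base:
    def __init__(self):
        self.__token = "base"  # stored as _Base__token

    def base_token(self):
        return self.__token  # resolved to self._Base__token


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__token = "child"  # stored as _Child__token; Base's value is untouched

    def child_token(self):
        return self.__token  # resolved to self._Child__token


child = Child()
print(child.base_token())   # base
print(child.child_token())  # child
print(sorted(vars(child)))  # ['_Base__token', '_Child__token']
```

Both attributes live side by side on the same instance, so the subclass cannot clobber the parent's value by accident.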
Double leading and subsequent underscores in Python
Finally we get to the most common kind of underscore in Python. If you look closely, you will notice that we got a ton of those up there when we printed the output of the built-in function dir for our instance. So why didn't I just explain the double leading and trailing underscores first? Because name mangling doesn't apply here! The reason is simply that names ending in two underscores are exempt from name mangling.
What are those double leading and subsequent underscores now? In short: they mark special methods and attributes that Python itself defines and calls, and you shouldn't invent your own names with double leading and trailing underscores, as this might result in overriding Python's built-in behavior unintentionally. However, there are cases in which you want to override those methods deliberately, which is of course allowed. For some constructs, such as context managers, it is even necessary to implement certain special methods.
If you're deep into Python you might also encounter the term dunder, which is short for "double underscore". So we would pronounce our attribute as "dunder MyAttribute". But note that we would also pronounce "__init__" as "dunder init", not "dunder init dunder".
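As a rough illustration of implementing such methods on purpose, here is a sketch of a context manager. The ManagedResource class is hypothetical; __enter__, __exit__ and __len__ are the names Python looks for when it handles the with statement and the len function:

```python
class ManagedResource:
    """Hypothetical resource that records what happens to it."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        # Called when the with block is entered.
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called when the with block is left, even after an exception.
        self.events.append("exit")
        return False  # do not suppress exceptions

    def __len__(self):
        # Called by the built-in len().
        return len(self.events)


with ManagedResource() as resource:
    resource.events.append("work")

print(resource.events)  # ['enter', 'work', 'exit']
print(len(resource))    # 3
```

You never call these methods by name here; Python calls them for you, which is exactly why their names are reserved for the language.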
This concludes my article on underscores in Python. I hope you enjoyed reading and didn't miss anything. Whatever your impression of my post was, I would love to hear your feedback!