Introduction to Storing Passwords

Get introduced to the intended audience and content of this course.

Should you really be storing passwords?

I’m sorry to begin like this, but you should avoid everything I’m about to tell you about storing passwords. Authentication is hard enough, and then you still have to work out account creation, password resets, and two-factor authentication. By the time you finish all of that, your company’s sales team will probably have learned about Single Sign-On with Active Directory. Ultimately, you should not be storing passwords. However, if you have to, this course will teach you how to store them as securely as possible.

If you’re starting from scratch on a simple project, look at Google Sign-In or GitHub OAuth. For more complicated projects, consider a stand-alone app, like Keycloak, or a complete service, like Auth0. The hard work has already been done; you might as well benefit from it.

Who needs to know this stuff?

This course is intended for developers building a web app that users log into, or developers who have inherited such a system. There are lots of these apps in the wild, and some of them are quite bad. You can usually tell a bad one by its unnecessarily short maximum password length and pointless character restrictions. However, my goal isn’t to shame anyone, just to arm you with facts and a plan. So, if you work on an app like this, fear not. This course will explain the problems and tell you how to fix them.

How to store passwords

Before we get to bad password storage, let’s look at good password storage. The best password hashing algorithm available today is Argon2, and the best implementation for Python is PyNaCl. You don’t have to believe me just yet. I’ll explain why throughout the course.

Below are the guts of a user authentication app using PyNaCl. This example, like all the examples in this course, is written in Python with SQLite for storage. If you’re using a different language, don’t worry: PyNaCl is a Python binding to libsodium, and libsodium has bindings for many other languages.

import sqlite3

from nacl import pwhash
from nacl.exceptions import InvalidkeyError

# Argon2 cost settings. Higher limits make every guess more expensive.
OPS_LIMIT = pwhash.OPSLIMIT_MODERATE
MEM_LIMIT = pwhash.MEMLIMIT_MODERATE

conn = sqlite3.connect('users.db')
conn.execute('''
    CREATE TABLE IF NOT EXISTS users (
        username VARCHAR(16) PRIMARY KEY,
        nacl_pwhash VARCHAR(100)
    )
''')


def create_account(username, password):
    if len(password) < 8:
        raise Exception('Password too short')
    cursor = conn.cursor()
    cursor.execute('SELECT count(*) FROM users WHERE username=?', (username,))
    result = cursor.fetchone()
    if result[0] > 0:
        raise Exception('Username already taken')
    # pwhash.str generates a random salt and returns a self-describing hash.
    hashed = pwhash.str(
        bytes(password, 'UTF-8'),
        opslimit=OPS_LIMIT,
        memlimit=MEM_LIMIT,
    )
    cursor.execute('INSERT INTO users (username, nacl_pwhash) VALUES (?, ?)', (username, hashed))
    conn.commit()


def login(username, password):
    cursor = conn.cursor()
    cursor.execute('SELECT nacl_pwhash FROM users WHERE username=?', (username,))
    result = cursor.fetchone()
    if result is None:
        # User doesn't exist. Hash anyway so the login is still slow and the
        # response time doesn't reveal which usernames are registered.
        pwhash.str(
            b'',
            opslimit=OPS_LIMIT,
            memlimit=MEM_LIMIT,
        )
        raise Exception('Invalid username or password')
    try:
        # verify raises InvalidkeyError when the password doesn't match.
        pwhash.verify(result[0], bytes(password, 'UTF-8'))
    except InvalidkeyError:
        raise Exception('Invalid username or password')


# Make a couple accounts for demonstration.
# Re-running against an existing users.db raises 'Username already taken'.
create_account('jim', 'password')
create_account('sue', 'another-password')

try:
    # Attempt to log in
    login('jim', 'password')
    print('Login succeeded')
except Exception as e:
    print(f'Login error: {e}')

This example creates a database of users and then simulates a login. Try changing the username or password that’s passed to login and confirm that it fails. You can also simulate a password dump:

for row in conn.cursor().execute('SELECT * FROM users'):
    print(row)

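That will print something like the following. The salt and hash differ on every run, and the m and t values depend on the memory and operations limits in use: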

('jim', b'$argon2id$v=19$m=65536,t=2,p=1$vKUwc4GXTdWFxbmdfm7rew$5x6dY7g2sDquntdSQuzxL1S3mU5Wnfpr+LTTsvunrew')
('sue', b'$argon2id$v=19$m=65536,t=2,p=1$Gyshn0jEtyyrPjQS2hBP3A$4YkTNdqaBHNMv00jn4j65Z8y8AMfOrHnHEWdXEXovZE')

Note: The dumped data doesn’t contain any actual passwords. nacl.pwhash.str returns a single string that holds everything needed to verify the password later: the algorithm, its version, the cost parameters, the salt, and the hash itself. We can put that string right into the database. It’s incredibly convenient.
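To see what is packed into one of those strings, here is a minimal sketch that splits a dumped value into its fields using only the standard library. The helper name parse_argon2_hash is hypothetical and is not part of PyNaCl. The sketch assumes the $-separated layout shown in the dump above, and it only inspects the string; it doesn't verify anything.

```python
import base64


def parse_argon2_hash(encoded):
    """Split an Argon2 encoded hash (bytes) into labelled fields.

    Hypothetical helper for illustration only. Assumes the layout
    $<variant>$v=<version>$m=<KiB>,t=<passes>,p=<lanes>$<salt>$<digest>
    """
    _, variant, version, params, salt_b64, digest_b64 = encoded.decode('ascii').split('$')
    costs = dict(item.split('=') for item in params.split(','))

    def b64decode_unpadded(text):
        # The encoded string drops base64 padding, so restore it first.
        return base64.b64decode(text + '=' * (-len(text) % 4))

    return {
        'variant': variant,
        'version': int(version.split('=')[1]),
        'memory_kib': int(costs['m']),
        'passes': int(costs['t']),
        'lanes': int(costs['p']),
        'salt': b64decode_unpadded(salt_b64),
        'digest': b64decode_unpadded(digest_b64),
    }


# One of the rows from the dump above.
row = ('jim', b'$argon2id$v=19$m=65536,t=2,p=1$vKUwc4GXTdWFxbmdfm7rew$5x6dY7g2sDquntdSQuzxL1S3mU5Wnfpr+LTTsvunrew')
fields = parse_argon2_hash(row[1])
for name, value in fields.items():
    print(name, value)
```

Because the salt and cost parameters travel with the hash, verification needs nothing beyond the stored string and the password the user typed. You never need to parse the string yourself in a real app; pwhash.verify does that for you.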

This authentication example is where we’re trying to get to, so we’ll revisit it at the end of the course.