How To Set Up a JupyterHub

This tutorial is only meant for Linux/UNIX users, as JupyterHub only supports those platforms [1].

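Here you will find out how to set up a JupyterHub using Google authentication to allow only selected users. The sections below cover the user accounts, the installation, the Google OAuth configuration, the Google API setup, a systemd daemon, serving through nginx, and an extra on Visual Studio Code.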

This tutorial is relatively long because I go into a lot of detail. However, setting up a JupyterHub is super easy and super fast. It took me about 10 minutes to set up the last one I put together.

Setting up all the user accounts

You will usually want to make sure that your JupyterHub is secured so that only approved users can log in. This matters all the more for Jupyter-like servers, which give access to a terminal: an unwanted user could in theory take full control of your machine.

To this end, using Google OAuth2 authentication is a convenient and secure way to identify users.

Another trick I found while setting up our own JupyterHub server is, in addition to the Google OAuth2, to only allow users that exist on the machine. If the user doesn’t exist, even though the Google identification goes through, the login yields an error that blocks access to the server. It is not super pretty, but it works.

Therefore, before setting up our server, let’s create our users and their home directories. On Linux, you can use the script create_users.sh below, which automatically creates all the users:

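#!/bin/bash
# Usage: sudo ./create_users.sh <password>

# Replace <list of users> with space-separated user names
for uu in <list of users>
do
  # Create the account together with its home directory
  adduser -m "${uu}"
  # Set the initial password passed as first argument
  echo "${uu}:$1" | chpasswd
  mkdir -p "/home/${uu}/AmunMount"
  chown "${uu}" "/home/${uu}/AmunMount"
done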

Make the script executable and run sudo ./create_users.sh <password>, where you replace <password> with the password you want for the users. This password won’t be used further. In the script, replace <list of users> with a list of user names separated only by spaces (no commas, no parentheses or brackets, etc.). **The user names need to match the Google accounts you will whitelist on the server, i.e. the part before @domain.com**
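Before moving on, it can be handy to confirm that every account was actually created. The snippet below is a minimal sketch using Python's standard `pwd` module; the function name `missing_local_users` and the example names are only illustrative and are not part of JupyterHub.

```python
import pwd


def missing_local_users(usernames, lookup=pwd.getpwnam):
    """Return the names in `usernames` that have no local account.

    `lookup` defaults to pwd.getpwnam, which raises KeyError for unknown users.
    """
    missing = []
    for name in usernames:
        try:
            lookup(name)
        except KeyError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical list: use the same names as in create_users.sh
    expected = ["gmoille", "kartiks"]
    print("Missing accounts:", missing_local_users(expected))
```

An empty list means that all the accounts exist on the machine.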

Installing Jupyter, JupyterLab, JupyterHub

Let’s get to the core of this tutorial: installing the server! Most of this part has been taken from the JupyterHub documentation [1]. It is pretty straightforward: install Jupyter, JupyterLab and JupyterHub using your favorite Python package/environment manager

pip install jupyter notebook jupyterlab jupyterhub 

or

pip3 install jupyter notebook jupyterlab jupyterhub 

JupyterHub relies on an HTTP proxy to tie the Jupyter spawner to the hub. To install it, first install nodejs

sudo apt-get install nodejs npm

if you are on a Debian-based distro, or, if you are on Fedora (and if you are not, just switch to it!)

sudo dnf install nodejs npm

Then install the http-proxy

npm install -g configurable-http-proxy

And that’s all! You have just installed JupyterHub, which allows you to serve Jupyter or JupyterLab to different users! Pretty neat, huh?

You can always check if everything is alright:

jupyterhub -h
configurable-http-proxy -h
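The same check can be scripted. Here is a small sketch, assuming only that both commands are expected on the PATH of the account that will run the hub; the helper name `missing_commands` is made up for this example.

```python
import shutil


def missing_commands(commands, which=shutil.which):
    """Return the commands that cannot be found on the current PATH."""
    return [cmd for cmd in commands if which(cmd) is None]


if __name__ == "__main__":
    needed = ["jupyterhub", "configurable-http-proxy"]
    missing = missing_commands(needed)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All commands found")
```

If a command installed with npm is reported as missing, the global npm bin directory is probably not on the PATH.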

Setting up JupyterHub to use Google OAuth

OK, so far we have done half of the job. Now we need to set up the JupyterHub server such that it allows some users and refuses others.

Usually I tend to put my custom apps and config in the /opt folder, so if you prefer to do otherwise, change the basepath accordingly.

To start let’s install the python package for authentication

pip install oauthenticator

First, let’s create the folder where the config, cookie and database will be stored and create the configuration file

sudo mkdir /opt/jupyterhub
sudo chmod 775 /opt/jupyterhub
sudo touch /opt/jupyterhub/jupyterhub_config.py

The configuration file for JupyterHub is a regular Python file that can be customized for a lot of things [2]. A full example can be generated with jupyterhub --generate-config, but we will not cover this here. Instead I will guide you through my setup and the minimum configuration for Google OAuth.

The basic start of the file is:

import os
from oauthenticator.google import GoogleOAuthenticator
c = get_config()
c.JupyterHub.log_level = 10

Let’s set up the cookie secret and a few locations for JupyterHub. Here we also set the URL and port of the HTTP proxy that the hub uses to connect to the individual Jupyter servers. This port will be needed when we set up the daemon. We also add another layer of security, which is to delete any invalid user; one is never too cautious (although I am not sure it does anything)

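# Secret used to sign the login cookies
c.JupyterHub.cookie_secret_file = '/opt/jupyterhub/jupyterhub_cookie_secret'
# Note: auth_token is the token string itself, not a path to a file
c.ConfigurableHTTPProxy.auth_token = '/opt/jupyterhub/proxy_auth_token'
# URL of the proxy API that the hub talks to
c.ConfigurableHTTPProxy.api_url = 'http://localhost:5432'
c.Authenticator.delete_invalid_users = True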
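JupyterHub generates the cookie secret file by itself if it does not exist yet, but it can also be created ahead of time. The sketch below writes 32 random bytes, hex-encoded, to a file that only its owner can read; the helper name `write_cookie_secret` is an assumption for this example, and the demo writes to a temporary directory rather than to /opt/jupyterhub.

```python
import os
import secrets
import tempfile


def write_cookie_secret(path):
    """Write 32 random bytes, hex-encoded, to `path`, readable by the owner only."""
    # O_EXCL refuses to overwrite an existing secret
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(secrets.token_hex(32))


if __name__ == "__main__":
    # Demo in a temporary directory; on the server the target would be
    # /opt/jupyterhub/jupyterhub_cookie_secret (written as root).
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "jupyterhub_cookie_secret")
        write_cookie_secret(demo_path)
        print(oct(os.stat(demo_path).st_mode & 0o777))
```

The same approach would work for generating a random proxy token, which is then passed as a string to the proxy configuration.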

The GoogleOAuthenticator is already imported, so we can start setting it up. Here we specify <domain_name>, which is the domain after the @. For instance, my email being gmoille@umd.edu, I would replace it with umd.edu

c.JupyterHub.authenticator_class = GoogleOAuthenticator
c.GoogleOAuthenticator.hosted_domain = ['<domain_name>']

Then you need to add the whitelist of users. For instance, you can append the following, where I put myself as an administrator:

# Named `whitelist` before JupyterHub 1.2
c.Authenticator.allowed_users = {'gmoille', 'kartiks'}
c.Authenticator.admin_users = {'gmoille'}
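To make the relation between the Google account and the whitelisted name explicit, here is a rough illustration of the rule this tutorial relies on: the domain must match, and the part before the @ must be in the list. This is not the oauthenticator implementation, only a sketch; the function `is_allowed` is hypothetical.

```python
def is_allowed(email, hosted_domain, allowed_users):
    """Illustrative check: the domain must match and the short name must be listed."""
    # "gmoille@umd.edu" -> local_part "gmoille", domain "umd.edu"
    local_part, _, domain = email.rpartition("@")
    if domain.lower() != hosted_domain.lower():
        return False
    return local_part in allowed_users


if __name__ == "__main__":
    allowed = {"gmoille", "kartiks"}
    for address in ("gmoille@umd.edu", "gmoille@gmail.com", "someone@umd.edu"):
        print(address, is_allowed(address, "umd.edu", allowed))
```

This is also why the local user names created earlier should be the short names and not the full email addresses.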

Now we can add the parameters for the Google authentication. We will go into more detail in the next section

c.GoogleOAuthenticator.oauth_callback_url = 'https://<your_server>/hub/oauth_callback'
c.GoogleOAuthenticator.client_id = 'xxxxxxxxxxxxx'
c.GoogleOAuthenticator.client_secret = 'xxxxxxxxxxxxx'
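Since the configuration file is plain Python, the client ID and secret do not have to be written in the file itself. One option, sketched here under the assumption that you export them as environment variables before starting the hub, is a small helper that fails loudly when a variable is missing. The helper `require_env` and the variable names are hypothetical.

```python
import os


def require_env(name, environ=os.environ):
    """Return the value of an environment variable, or fail loudly if unset."""
    value = environ.get(name, "")
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value


# In jupyterhub_config.py this could then be used as follows
# (the variable names are assumptions, pick your own):
# c.GoogleOAuthenticator.client_id = require_env("GOOGLE_OAUTH_CLIENT_ID")
# c.GoogleOAuthenticator.client_secret = require_env("GOOGLE_OAUTH_CLIENT_SECRET")
```

This keeps the secret out of any copy of the configuration file that gets shared or versioned.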

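Let’s say you actually want anybody within your <domain_name> to access it, even if you didn’t create their user. You can automatically create a new local user based on the Google OAuth login using the following (found in [3]). Note that add_user_cmd is only used by local authenticators, such as LocalGoogleOAuthenticator, when create_system_users is enabled: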

c.Authenticator.add_user_cmd = ['adduser', '-q', '--gecos', '""', '--disabled-password', '--force-badname']

Setting Up Google API

Because JupyterLab gives access not only to notebooks but also to terminals, it is very important to set up the security of your server correctly. We already set up the JupyterHub server to use Google authentication, but we need to set up the handshake and the callback correctly for them to be accepted on the Google side.

To do so, log into Google with your <domain_name> account (i.e. for me @umd.edu) and go to the Google API console

Creating a daemon for JupyterHub

Instead of running JupyterHub manually, it is convenient to create a systemd daemon that is launched at boot and can be stopped or restarted using systemctl.

To do so, create the service file

sudo touch /etc/systemd/system/jupyterhub_server.service
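As a rough idea of what such a service file can contain, the sketch below renders a minimal unit as text. Everything in it is an assumption rather than my exact setup: the path to the jupyterhub executable, running as root, and the restart policy should be adapted to your machine.

```python
from textwrap import dedent


def render_unit(jupyterhub_bin, config_file, workdir):
    """Return the text of a minimal systemd unit that starts JupyterHub."""
    return dedent(f"""\
        [Unit]
        Description=JupyterHub
        After=network.target

        [Service]
        User=root
        WorkingDirectory={workdir}
        ExecStart={jupyterhub_bin} -f {config_file}
        Restart=on-failure

        [Install]
        WantedBy=multi-user.target
        """)


if __name__ == "__main__":
    # Hypothetical executable path: check yours with `which jupyterhub`
    print(render_unit(
        "/usr/local/bin/jupyterhub",
        "/opt/jupyterhub/jupyterhub_config.py",
        "/opt/jupyterhub",
    ))
```

The printed text can be pasted into the service file, after which sudo systemctl daemon-reload makes systemd pick it up.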

Serving JupyterHub through a website with nginx

Extra: Visual Studio Code in JupyterHub