Setting up JupyterHub with Anaconda + sudospawner

Jupyter is a great system for "literate programming" in which you can write code in Python or several other high-level languages and directly view the output (text or graphics), together with LaTeX, through a web interface.



To do collaborative work, or to use this tool for teaching, it is useful to have a system that allows users, also through a web interface, to launch the Jupyter notebook server, with each server running in the space of the user who logs in.





JupyterHub is a Python package that allows different users to log in to a server and launch Jupyter notebook sessions.


By default, JupyterHub is configured to run as root. However, if we want to expose this service to a public network, we should run it as a user with limited privileges. This is achieved through the sudospawner package. This part is taken from the guide at
https://github.com/jupyterhub/jupyterhub/wiki/Using-sudo-to-run-JupyterHub-without-root-privileges


Steps:


1. Install Anaconda. It can be downloaded from https://www.continuum.io/downloads. Note that JupyterHub requires Python 3.

2. Install Jupyter (with Anaconda, it comes out of the box).

3. Install jupyterhub
$ conda install --channel https://conda.anaconda.org/conda-forge jupyterhub
4. Set up a user jupyterhub, with a home folder containing
    * the certificates
    * the database
    * a folder to share among all the users.
 
$ mkdir [JUPYTER_HOMEPATH]
$ mkdir [JUPYTER_HOMEPATH]/notebooks
$ mkdir [JUPYTER_HOMEPATH]/certs
$ useradd  -d [JUPYTER_HOMEPATH]  -s ""  jupyterhub
$ chown -R jupyterhub:jupyterhub [JUPYTER_HOMEPATH]
$ cd [JUPYTER_HOMEPATH]/certs
$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout server.key -out server.crt
$ chown -R jupyterhub:root server.*
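
The folder layout from this step can also be sketched in Python, which is handy for rehearsing it in a scratch directory before touching the real system. The function name make_hub_home and the 0o700 mode on the certs folder are assumptions of this sketch, not part of the original commands. Ownership still has to be set with chown as in the commands above.

```python
import os
import tempfile
from pathlib import Path


def make_hub_home(home):
    """Create the hub home with its notebooks/ and certs/ subfolders."""
    home = Path(home)
    for sub in ("notebooks", "certs"):
        (home / sub).mkdir(parents=True, exist_ok=True)
    # Assumption: keep the certificate folder private to its owner.
    os.chmod(home / "certs", 0o700)
    return sorted(p.name for p in home.iterdir())


if __name__ == "__main__":
    # Rehearse in a throwaway directory instead of [JUPYTER_HOMEPATH].
    with tempfile.TemporaryDirectory() as scratch:
        print(make_hub_home(Path(scratch) / "jupyterhub"))
```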

 

5. Add the user jupyterhub to the group shadow
$ usermod -a -G shadow jupyterhub


6. Install sudospawner. We do that through Anaconda's package manager.
    $ conda install --channel https://conda.anaconda.org/conda-forge sudospawner
    $ ln -s [ANACONDA_PATH]/bin/sudospawner /usr/local/bin/

7. Edit the sudoers file (by means of the visudo command)
$ visudo
and add the following lines at the end of the file:


Cmnd_Alias JUPYTER_CMD = /usr/local/bin/sudospawner
jupyterhub ALL=(%jupyterhub) NOPASSWD:JUPYTER_CMD 




Note that it is important that the commands
* sudospawner
* configurable-http-proxy
* jupyterhub-singleuser
* node

are inside /usr/local/bin and not only in Anaconda's bin folder, so that sudo recognizes them as valid commands. To do that, we need to create links in /usr/local/bin pointing to the corresponding files in [ANACONDA_PATH]/bin.
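
A quick way to see which of these links are still missing is sketched below. It only looks inside one directory and ignores the caller's PATH. The function name missing_commands is hypothetical; the command list is the one given above.

```python
import shutil

REQUIRED = (
    "sudospawner",
    "configurable-http-proxy",
    "jupyterhub-singleuser",
    "node",
)


def missing_commands(bin_dir="/usr/local/bin", commands=REQUIRED):
    """Return the commands that are not executable files inside bin_dir."""
    # Passing path= restricts the lookup to bin_dir only.
    return [c for c in commands if shutil.which(c, path=bin_dir) is None]


if __name__ == "__main__":
    for name in missing_commands():
        print("missing link in /usr/local/bin:", name)
```

An empty result means every command in the list resolves to an executable (or a link to one) in that folder.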




8. Add to the group jupyterhub all the users that should be able to log in to JupyterHub.
$ usermod -a -G jupyterhub [user1] 
$ usermod -a -G jupyterhub [user2]
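
To confirm which accounts ended up in the group, something like the following sketch may help. It uses the standard grp module, so it only works on Unix, and it only reports supplementary members: a user whose primary group is jupyterhub would not be listed. The function name group_members is hypothetical.

```python
import grp


def group_members(group="jupyterhub"):
    """List the users added to the group as supplementary members."""
    try:
        return sorted(grp.getgrnam(group).gr_mem)
    except KeyError:
        # The group does not exist on this machine.
        return []


if __name__ == "__main__":
    print(group_members())
```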
9. Create /etc/jupyterhub/ and generate the configuration file
$ mkdir /etc/jupyterhub
$ cd /etc/jupyterhub
$ [ANACONDA_PATH]/bin/jupyterhub --generate-config   

$ chown -R jupyterhub.jupyterhub /etc/jupyterhub
$ nano jupyterhub_config.py
and set the following parameters:

c.JupyterHub.admin_access = True
c.JupyterHub.cookie_secret_file = '[JUPYTER_HOMEPATH]/jupyterhub_cookie_secret'
c.JupyterHub.ip = '[server IP]'
c.JupyterHub.port = 8000
c.JupyterHub.hub_port = 8082
c.JupyterHub.ssl_cert = '[JUPYTER_HOMEPATH]/certs/server.crt'
c.JupyterHub.ssl_key = '[JUPYTER_HOMEPATH]/certs/server.key'
c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'

c.Spawner.notebook_dir = '~/jupyter-notebooks'
c.Spawner.ip = '[server IP]'
c.Authenticator.admin_users = {"[Admin username]"}
c.PAMAuthenticator.service = 'login'
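
The cookie_secret_file setting points at a file inside the hub's home folder. To my understanding JupyterHub can create this file itself on first start, but if you prefer to prepare it by hand, a minimal sketch follows. It assumes the usual convention of 32 random bytes written as hex, in a file that only its owner can read or write. The function name write_cookie_secret is hypothetical.

```python
import os
import secrets


def write_cookie_secret(path):
    """Write 32 random bytes, hex-encoded, to a file only the owner can access."""
    secret = secrets.token_hex(32)
    # O_EXCL refuses to overwrite an existing secret; 0o600 keeps it private.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(secret)
    return path
```

Run it as the jupyterhub user (or chown the file afterwards) so that the hub can read the file.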

10. Create the service file /etc/init.d/jupyterhub. The following file is an example that works on Ubuntu:

#! /bin/sh
### BEGIN INIT INFO
# Provides:          jupyterhub
# Required-Start:    $remote_fs $syslog
# Required-Stop:     $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start jupyterhub
# Description:       This file should be used to construct scripts to be
#                    placed in /etc/init.d.
### END INIT INFO

# Author: Alisue
#
# Please remove the "Author" lines above and replace them
# with your own name if you copy and modify this script.

# Do NOT "set -e"

# Source function library.
#For centos
#. /etc/rc.d/init.d/functions


# PATH should only include /usr/* if it runs after the mountnfs.sh script
DESC="Multi-user server for Jupyter notebooks"
NAME=jupyterhub
RUNAS_USER="jupyterhub"
DAEMON=/opt/anaconda3/bin/jupyterhub
WORKDIRECTORY=[JUPYTER_HOMEPATH]
DAEMON_ARGS=" -f /etc/jupyterhub/jupyterhub_config.py --log-level=DEBUG"
PIDFILE=/var/run/$NAME.pid
SCRIPTNAME=/etc/init.d/$NAME

# Exit if the package is not installed
[ -x "$DAEMON" ] || exit 0

# Read configuration variable file if it is present
[ -r /etc/default/$NAME ] && . /etc/default/$NAME

# Load the VERBOSE setting and other rcS variables
. /lib/init/vars.sh

# Define LSB log_* functions.
# Depend on lsb-base (>= 3.2-14) to ensure that this file is present
# and status_of_proc is working.
. /lib/lsb/init-functions

#
# Function that starts the daemon/service
#
do_start()
{
    # Return
    #   0 if daemon has been started
    #   1 if daemon was already running
    #   2 if daemon could not be started
    start-stop-daemon -v --start -d $WORKDIRECTORY -c $RUNAS_USER  --pidfile $PIDFILE --exec $DAEMON --test  #|| return 1
    RETVAL="$?"
    [ "$RETVAL" = 1 ] && return 1
    start-stop-daemon -v --start -d $WORKDIRECTORY --background -c $RUNAS_USER --make-pidfile --pidfile $PIDFILE \
    --exec $DAEMON --  $DAEMON_ARGS || return 2
    # Add code here, if necessary, that waits for the process to be ready
    # to handle requests from services started subsequently which depend
    # on this one.  As a last resort, sleep for some time.
    return 0
}

#
# Function that stops the daemon/service
#
do_stop()
{
    # Return
    #   0 if daemon has been stopped
    #   1 if daemon was already stopped
    #   2 if daemon could not be stopped
    #   other if a failure occurred
    start-stop-daemon -v --stop  --retry=TERM/30/KILL/5  --user $RUNAS_USER
    RETVAL="$?"
    [ "$RETVAL" = 2 ] && return 2
    # Wait for children to finish too if this is a daemon that forks
    # and if the daemon is only ever run from this initscript.
    # If the above conditions are not satisfied then add some other code
    # that waits for the process to drop all resources that could be
    # needed by services started subsequently.  A last resort is to
    # sleep for some time.
   
    start-stop-daemon -v --stop  --oknodo --retry=0/30/KILL/5 --user $RUNAS_USER
    [ "$?" = 2 ] && return 2
    # Many daemons don't delete their pidfiles when they exit.
    rm -f $PIDFILE
    return "$RETVAL"
}

#
# Function that sends a SIGHUP to the daemon/service
#
do_reload() {
    #
    # If the daemon can reload its configuration without
    # restarting (for example, when it is sent a SIGHUP),
    # then implement that here.
    #
     start-stop-daemon --stop --signal 1 --quiet --pidfile $PIDFILE --name $NAME
    do_stop  || return "$?"
    do_start || return "$?"
    return 0
}

case "$1" in
  start)
    [ "$VERBOSE" != no ] && log_daemon_msg "Starting $DESC" "$NAME"
    do_start
    case "$?" in
        0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
        2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
    esac
    # "return" is only valid inside a function, so use "exit" here
    exit 0
    ;;
  stop)
    [ "$VERBOSE" != no ] && log_daemon_msg "Stopping $DESC" "$NAME"
    do_stop
    RET=$?
    case "$RET" in
        0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
        2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
        esac
    ;;
  status)
    status_of_proc "$DAEMON" "$NAME" && exit 0 || exit $?
    ;;
  #reload|force-reload)
    #
    # If do_reload() is not implemented then leave this commented out
    # and leave 'force-reload' as an alias for 'restart'.
    #
    #log_daemon_msg "Reloading $DESC" "$NAME"
    #do_reload
    #log_end_msg $?
    #;;
  restart|force-reload)
    #
    # If the "reload" option is implemented then remove the
    # 'force-reload' alias
    #
    log_daemon_msg "Restarting $DESC" "$NAME"
    do_stop
    case "$?" in
          0|1)
        do_start
        case "$?" in
            0) log_end_msg 0 ;;
            1) log_end_msg 1 ;; # Old process is still running
            *) log_end_msg 1 ;; # Failed to start
            esac
        ;;
          *)
        # Failed to stop
        log_end_msg 1
        ;;
        esac
    ;;
  *)
    echo "Usage: $SCRIPTNAME {start|stop|status|restart|force-reload}"
    exit 3
    ;;
esac
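
Once the service has been started with this script, one simple check is whether the hub answers on its public port. The sketch below only tests that a TCP connection can be opened; it does not speak HTTPS or validate the certificate. The function name port_is_open is hypothetical, and the address in the example call is a placeholder for the [server IP] and port 8000 used in the configuration.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out.
        return False


if __name__ == "__main__":
    # Assumption: replace 127.0.0.1 with the address set in c.JupyterHub.ip.
    print(port_is_open("127.0.0.1", 8000))
```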


11. Make the script executable and add jupyterhub to the list of services started at boot.
$ chmod +x /etc/init.d/jupyterhub
$ for A in 2 3 4 5; do ln -s /etc/init.d/jupyterhub "/etc/rc$A.d/S99jupyterhub"; done
$ for A in 0 6; do ln -s /etc/init.d/jupyterhub "/etc/rc$A.d/K01jupyterhub"; done
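
The same start and kill links can be sketched in Python, for instance to rehearse them in a scratch directory first. The function name link_runlevels and its parameters are hypothetical; the S99 and K01 prefixes and the runlevels are the ones used in the loops above. Note that this sketch creates any missing rcN.d folder, which the shell loops do not.

```python
import os


def link_runlevels(init_script, etc_dir, start=(2, 3, 4, 5), stop=(0, 6)):
    """Create S99/K01 symlinks to init_script under etc_dir/rcN.d."""
    name = os.path.basename(init_script)
    links = []
    for levels, prefix in ((start, "S99"), (stop, "K01")):
        for level in levels:
            rc_dir = os.path.join(etc_dir, "rc%d.d" % level)
            os.makedirs(rc_dir, exist_ok=True)
            link = os.path.join(rc_dir, prefix + name)
            # Leave an existing link untouched instead of failing.
            if not os.path.lexists(link):
                os.symlink(init_script, link)
            links.append(link)
    return links
```

Calling link_runlevels("/etc/init.d/jupyterhub", "/etc") as root would mirror the two shell loops.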

Setting up JupyterHub with Anaconda + sudospawner

Jupyter is a great system for "literate programming" in which we can write code in Python or several other high-level languages and directly view the output (text or graphics) together with LaTeX code through a web interface.

To do collaborative work, or to use this tool for teaching, it is useful to have a system that allows users, also through a web interface, to launch the Jupyter notebook server, with the servers running in the space of the user who logs in.


JupyterHub is a Python package that allows different users to log in to a server and start Jupyter notebook sessions.

By default, JupyterHub is configured to run as root. However, if we want to expose this service to a public network, it is better to run it as a user with limited privileges. This is achieved with the sudospawner package. I took this part from the guide at https://github.com/jupyterhub/jupyterhub/wiki/Using-sudo-to-run-JupyterHub-without-root-privileges


Steps:


1. Install Anaconda. It can be downloaded for free from https://www.continuum.io/downloads. Note that JupyterHub requires Python 3.

2. Install Jupyter (with Anaconda, it is already installed by default).

3. Install jupyterhub
$ conda install --channel https://conda.anaconda.org/conda-forge jupyterhub
4. Set up a user jupyterhub, with a home folder containing
    + the certificates
    + the database
    + a shared folder

$ mkdir [JUPYTER_HOMEPATH]
$ mkdir [JUPYTER_HOMEPATH]/notebooks
$ mkdir [JUPYTER_HOMEPATH]/certs
$ useradd  -d [JUPYTER_HOMEPATH]  -s ""  jupyterhub
$ chown -R jupyterhub:jupyterhub [JUPYTER_HOMEPATH]
$ cd [JUPYTER_HOMEPATH]/certs
$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout server.key -out server.crt
$ chown -R jupyterhub:root server.*
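
After the chown above, it may be worth confirming that the private key is not accessible to other accounts. The sketch below checks the permission bits only, not the ownership. The function name is_private is hypothetical.

```python
import os
import stat


def is_private(path):
    """Return True if neither group nor others have any access to path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


if __name__ == "__main__":
    # Assumption: run from inside [JUPYTER_HOMEPATH]/certs.
    if os.path.exists("server.key"):
        print(is_private("server.key"))
```

If it reports False for server.key, running chmod 600 server.key is one way to tighten the permissions.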
 

 


5. Add the user jupyterhub to the group shadow
$ usermod -a -G shadow jupyterhub


6. Install sudospawner. We do this through Anaconda's package manager.
    $ conda install --channel https://conda.anaconda.org/conda-forge sudospawner
    $ ln -s [ANACONDA_PATH]/bin/sudospawner /usr/local/bin/
 
 

7. Edit the sudoers file (using the visudo command)
$ visudo
    and add the following lines at the end of the file


Cmnd_Alias JUPYTER_CMD = /usr/local/bin/sudospawner
jupyterhub ALL=(%jupyterhub) NOPASSWD:JUPYTER_CMD 




Note that it is important that the commands
* sudospawner
* configurable-http-proxy
* jupyterhub-singleuser
* node

are in /usr/local/bin and not only in Anaconda's folder, so that sudo recognizes them as valid commands.
To do this, we need to create links in /usr/local/bin pointing to the corresponding files in [ANACONDA_PATH]/bin.

8. Add to the group jupyterhub all the users who may use JupyterHub.
$ usermod -a -G jupyterhub [user1] 
$ usermod -a -G jupyterhub [user2]
9. Create the folder /etc/jupyterhub/ and generate a configuration file
$ mkdir /etc/jupyterhub
$ cd /etc/jupyterhub
$ [ANACONDA_PATH]/bin/jupyterhub --generate-config   
$ nano jupyterhub_config.py
and modify the following parameters

c.JupyterHub.admin_access = True
c.JupyterHub.cookie_secret_file = '[JUPYTER_HOMEPATH]/jupyterhub_cookie_secret'
c.JupyterHub.ip = '[server IP]'
c.JupyterHub.port = 8000
c.JupyterHub.hub_port = 8082
c.JupyterHub.ssl_cert = '[JUPYTER_HOMEPATH]/certs/server.crt'
c.JupyterHub.ssl_key = '[JUPYTER_HOMEPATH]/certs/server.key'
c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'

c.Spawner.notebook_dir = '~/jupyter-notebooks'
c.Spawner.ip = '[server IP]'
c.Authenticator.admin_users = {"[Admin username]"}
c.PAMAuthenticator.service = 'login'



10. Create a startup file /etc/init.d/jupyterhub. The following file is an example that works on Ubuntu:

#! /bin/sh
### BEGIN INIT INFO
# Provides:          jupyterhub
# Required-Start:    $remote_fs $syslog
# Required-Stop:     $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start jupyterhub
# Description:       This file should be used to construct scripts to be
#                    placed in /etc/init.d.
### END INIT INFO

# Author: Alisue
#
# Please remove the "Author" lines above and replace them
# with your own name if you copy and modify this script.

# Do NOT "set -e"

# Source function library.
#For centos
#. /etc/rc.d/init.d/functions


# PATH should only include /usr/* if it runs after the mountnfs.sh script
DESC="Multi-user server for Jupyter notebooks"
NAME=jupyterhub
RUNAS_USER="jupyterhub"
DAEMON=/opt/anaconda3/bin/jupyterhub
WORKDIRECTORY=[JUPYTER_HOMEPATH]
DAEMON_ARGS=" -f /etc/jupyterhub/jupyterhub_config.py --log-level=DEBUG"
PIDFILE=/var/run/$NAME.pid
SCRIPTNAME=/etc/init.d/$NAME

# Exit if the package is not installed
[ -x "$DAEMON" ] || exit 0

# Read configuration variable file if it is present
[ -r /etc/default/$NAME ] && . /etc/default/$NAME

# Load the VERBOSE setting and other rcS variables
. /lib/init/vars.sh

# Define LSB log_* functions.
# Depend on lsb-base (>= 3.2-14) to ensure that this file is present
# and status_of_proc is working.
. /lib/lsb/init-functions

#
# Function that starts the daemon/service
#
do_start()
{
    # Return
    #   0 if daemon has been started
    #   1 if daemon was already running
    #   2 if daemon could not be started
    start-stop-daemon -v --start -d $WORKDIRECTORY -c $RUNAS_USER  --pidfile $PIDFILE --exec $DAEMON --test  #|| return 1
    RETVAL="$?"
    [ "$RETVAL" = 1 ] && return 1
    start-stop-daemon -v --start -d $WORKDIRECTORY --background -c $RUNAS_USER --make-pidfile --pidfile $PIDFILE \
    --exec $DAEMON --  $DAEMON_ARGS || return 2
    # Add code here, if necessary, that waits for the process to be ready
    # to handle requests from services started subsequently which depend
    # on this one.  As a last resort, sleep for some time.
    return 0
}

#
# Function that stops the daemon/service
#
do_stop()
{
    # Return
    #   0 if daemon has been stopped
    #   1 if daemon was already stopped
    #   2 if daemon could not be stopped
    #   other if a failure occurred
    start-stop-daemon -v --stop  --retry=TERM/30/KILL/5  --user $RUNAS_USER
    RETVAL="$?"
    [ "$RETVAL" = 2 ] && return 2
    # Wait for children to finish too if this is a daemon that forks
    # and if the daemon is only ever run from this initscript.
    # If the above conditions are not satisfied then add some other code
    # that waits for the process to drop all resources that could be
    # needed by services started subsequently.  A last resort is to
    # sleep for some time.
   
    start-stop-daemon -v --stop  --oknodo --retry=0/30/KILL/5 --user $RUNAS_USER
    [ "$?" = 2 ] && return 2
    # Many daemons don't delete their pidfiles when they exit.
    rm -f $PIDFILE
    return "$RETVAL"
}

#
# Function that sends a SIGHUP to the daemon/service
#
do_reload() {
    #
    # If the daemon can reload its configuration without
    # restarting (for example, when it is sent a SIGHUP),
    # then implement that here.
    #
     start-stop-daemon --stop --signal 1 --quiet --pidfile $PIDFILE --name $NAME
    do_stop  || return "$?"
    do_start || return "$?"
    return 0
}

case "$1" in
  start)
    [ "$VERBOSE" != no ] && log_daemon_msg "Starting $DESC" "$NAME"
    do_start
    case "$?" in
        0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
        2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
    esac
    # "return" is only valid inside a function, so use "exit" here
    exit 0
    ;;
  stop)
    [ "$VERBOSE" != no ] && log_daemon_msg "Stopping $DESC" "$NAME"
    do_stop
    RET=$?
    case "$RET" in
        0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
        2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
        esac
    ;;
  status)
    status_of_proc "$DAEMON" "$NAME" && exit 0 || exit $?
    ;;
  #reload|force-reload)
    #
    # If do_reload() is not implemented then leave this commented out
    # and leave 'force-reload' as an alias for 'restart'.
    #
    #log_daemon_msg "Reloading $DESC" "$NAME"
    #do_reload
    #log_end_msg $?
    #;;
  restart|force-reload)
    #
    # If the "reload" option is implemented then remove the
    # 'force-reload' alias
    #
    log_daemon_msg "Restarting $DESC" "$NAME"
    do_stop
    case "$?" in
          0|1)
        do_start
        case "$?" in
            0) log_end_msg 0 ;;
            1) log_end_msg 1 ;; # Old process is still running
            *) log_end_msg 1 ;; # Failed to start
            esac
        ;;
          *)
        # Failed to stop
        log_end_msg 1
        ;;
        esac
    ;;
  *)
    echo "Usage: $SCRIPTNAME {start|stop|status|restart|force-reload}" >&2
    exit 3
    ;;
esac


11. Make the script executable and add jupyterhub to the list of services started at boot.
$ chmod +x /etc/init.d/jupyterhub
$ for A in 2 3 4 5; do ln -s /etc/init.d/jupyterhub "/etc/rc$A.d/S99jupyterhub"; done
$ for A in 0 6; do ln -s /etc/init.d/jupyterhub "/etc/rc$A.d/K01jupyterhub"; done