From: Sebastian Flothow
To: Mike Gabriel, 272@bugs.x2go.org
Subject: Bug#272: [X2Go-User] Session resume fails with AFS home directories
Date: Mon, 30 Sep 2013 17:21:17 +0200
X-X2Go-PR-Package: x2goserver

Hi,

On 18.09.2013 23:24, Mike Gabriel wrote:
> Does a session simply not resume (with an x2goagent still being present
> for this session)? Or does the x2goagent crash somewhere on the run
> (i.e. when the session is suspended and the AFS home freezes some time
> later)?

I did a quick test (note that I removed retain_after_close from the PAM config again, so that I can create broken sessions quickly without waiting for AFS token expiry).

Right after starting a new session, things look like this:

giplin101:~# x2golistsessions_root
26521|flothow-50-1380548308_stDXFCE_dp32|50|giplin101|R|2013-09-30T15:38:28|da9be4183a9c7d711f325742963c691e|10.0.0.105|30001|30002|2013-09-30T15:38:31|flothow|112|30003|
giplin101:~# ps 26521
  PID TTY      STAT   TIME COMMAND
26521 ?        S      0:00 /usr/lib/nx/../x2go/bin/x2goagent -extension XFIXES -extension GLX -nolisten tcp -D -auth /afs/gip.l

Then, after suspending the session:

giplin101:~# x2golistsessions_root
26521|flothow-50-1380548308_stDXFCE_dp32|50|giplin101|S|2013-09-30T15:38:28|da9be4183a9c7d711f325742963c691e|10.0.0.105|30001|30002|2013-09-30T15:41:32|flothow|186|30003|
giplin101:~# ps 26521
  PID TTY      STAT   TIME COMMAND
26521 ?        S      0:00 /usr/lib/nx/../x2go/bin/x2goagent -extension XFIXES -extension GLX -nolisten tcp -D -auth /afs/gip.l

After attempting to resume it:

giplin101:~# x2golistsessions_root
26521|flothow-50-1380548308_stDXFCE_dp32|50|giplin101|R|2013-09-30T15:38:28|da9be4183a9c7d711f325742963c691e|10.0.0.105|30001|30002|2013-09-30T15:43:20|flothow|315|30003|
giplin101:~# ps 26521
  PID TTY      STAT   TIME COMMAND
26521 ?        S      0:00 /usr/lib/nx/../x2go/bin/x2goagent -extension XFIXES -extension GLX -nolisten tcp -D -auth /afs/gip.l

It is still in this state now, more than half an hour later.

> If you look at
> the script /usr/bin/x2goresume-session, can you spot anything that might
> fail on AFS?

I already looked at this script a few weeks ago and added a bunch of debug statements which log various things to /var/log/x2godebug. When the script executes, there is a valid AFS token, $SESSION_DIR and ${SESSION_DIR}/options are readable, and the script completes successfully.

However, I don't think this result is meaningful. What happens is presumably this:

When first logging in (before an X2Go session exists), a new SSH session is created, which I'll refer to as the first SSH session. This session obtains a Kerberos ticket and an AFS token through PAM, and then spawns an X2Go session, which inherits both. The Kerberos ticket is stored in a file pointed to by $KRB5CCNAME, while the AFS token is tied to the PAG (Process Authentication Group).

When suspending the X2Go session, this first SSH session is terminated. Depending on the PAM configuration, the ticket and token are either removed immediately, or expire some time later.

Now, when attempting to resume the X2Go session, a new, second SSH session is created. This session again obtains a ticket and a token, and it seems to be this session in which x2goresume-session is executed; however, the new ticket and token live in a different file and a different PAG, respectively, than those from the first session, so the X2Go session can't use them.

After figuring this out, I remembered that pam_afs_session recognizes the parameter nopag, which inhibits PAG creation. Absent a PAG, AFS tokens are tied to user IDs instead, and indeed, when this option is set, sessions can be resumed even after their initial token expired: without PAGs, the new token from the second session propagates to the first session, since the user ID is identical.

After resuming, the X2Go session still doesn't have a valid Kerberos ticket (because there are still two different ticket files), but it does have an AFS token, which is all that matters for filesystem access. Obtaining a new Kerberos ticket can then be done manually if necessary.

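As an aside, the "two different ticket files" part of this hypothesis can be checked without touching AFS at all. The following is only a minimal sketch, assuming a Linux /proc filesystem and that the caller is root or owns the process; the helper names are made up for illustration. It reads the environment a process was started with and compares the KRB5CCNAME values. Note that /proc/<pid>/environ reflects the environment at exec time, and that it says nothing about which PAG the process is in.

```python
from pathlib import Path
from typing import Dict, Optional


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Parse the NUL-separated KEY=VALUE block found in /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def ticket_cache_of(pid: int) -> Optional[str]:
    """Return KRB5CCNAME as the given process saw it at startup, or None.

    Hypothetical helper: needs root or ownership of the process.
    """
    raw = Path(f"/proc/{pid}/environ").read_bytes()
    return parse_environ(raw).get("KRB5CCNAME")


def same_ticket_cache(agent_env: Dict[str, str], resume_env: Dict[str, str]) -> bool:
    """True only if both environments name the same, non-empty ticket cache."""
    agent_cc = agent_env.get("KRB5CCNAME")
    resume_cc = resume_env.get("KRB5CCNAME")
    return agent_cc is not None and agent_cc == resume_cc
```

Calling ticket_cache_of(26521) for the x2goagent process above and comparing the result with $KRB5CCNAME inside the second SSH session would show whether the two sessions really point at different ticket files.
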
However, I'm a bit wary of using nopag in a production environment, because the man page also warns:

"Be careful when using this option, since it means that the user will inherit a PAG from the process managing the login. If sshd, for instance, is started in a PAG, every user who logs in via ssh will be put in the same PAG and will share tokens if this option is used."

To fix this so that it works without nopag, we'd need to move an AFS token from one PAG to another. I'm not aware of any way to do this directly, but it might be possible to copy the Kerberos ticket from the new ticket file to the old one, and then call aklog within the old session before attempting any file system access.

- Sebastian
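
The ticket-copying half of that idea could look roughly like the sketch below. It is untested against X2Go and rests on several assumptions: both sessions use file-based credential caches (a KRB5CCNAME of the form FILE:/path or a bare path), the caller is the session's user, and the function names are invented for illustration. It deliberately does not run aklog: for the token to land in the old PAG, aklog would have to be started by a process inside the old session, and how to arrange that is the open question.

```python
import os
import shutil
import tempfile
from pathlib import Path


def cache_path(ccname: str) -> Path:
    """Turn a KRB5CCNAME value into a file path.

    Only FILE: caches and bare paths are handled; other cache types
    (for example KEYRING: or DIR:) cannot be copied this way.
    """
    if ccname.startswith("FILE:"):
        return Path(ccname[len("FILE:"):])
    if ":" in ccname:
        raise ValueError(f"unsupported credential cache type: {ccname}")
    return Path(ccname)


def copy_ticket_cache(new_ccname: str, old_ccname: str) -> Path:
    """Replace the old session's ticket file with a copy of the new one."""
    src = cache_path(new_ccname)
    dst = cache_path(old_ccname)
    # Copy into a temporary file next to the destination and rename it,
    # so a reader never sees a half-written cache. mkstemp creates the
    # file with mode 0600, which copyfile leaves unchanged.
    fd, tmp = tempfile.mkstemp(dir=str(dst.parent), prefix=dst.name + ".")
    os.close(fd)
    try:
        shutil.copyfile(src, tmp)
        os.replace(tmp, dst)
    except BaseException:
        os.unlink(tmp)
        raise
    return dst
```

In x2goresume-session this would be called with the second session's $KRB5CCNAME as the source and the first session's value as the destination, followed by an aklog run from within the old session.
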