From unknown Thu Mar 28 18:35:58 2024
X-Loop: owner@bugs.x2go.org
Subject: Bug#1317: extended checks for loadchecker (or broker ... dunno)
Reply-To: Walid MOGHRABI , 1317@bugs.x2go.org
Resent-From: Walid MOGHRABI
Resent-To: x2go-dev@lists.x2go.org
Resent-CC: X2Go Developers
X-Loop: owner@bugs.x2go.org
Resent-Date: Mon, 13 Aug 2018 10:55:01 +0000
Resent-Message-ID:
Resent-Sender: owner@bugs.x2go.org
X-X2Go-PR-Message: report 1317
X-X2Go-PR-Package: x2gobroker
X-X2Go-PR-Keywords:
Received: via spool by submit@bugs.x2go.org id=B.15341576824555 (code B); Mon, 13 Aug 2018 10:55:01 +0000
Received: (at submit) by bugs.x2go.org; 13 Aug 2018 10:54:42 +0000
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on ymir.das-netzwerkteam.de
X-Spam-Level:
X-Spam-Status: No, score=0.8 required=3.0 tests=BAYES_50 autolearn=ham autolearn_force=no version=3.4.1
Received: from localhost (localhost [127.0.0.1]) by ymir.das-netzwerkteam.de (Postfix) with ESMTP id 13B245DAEA for ; Mon, 13 Aug 2018 12:54:41 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at ymir.das-netzwerkteam.de
Received: from ymir.das-netzwerkteam.de ([127.0.0.1]) by localhost (ymir.das-netzwerkteam.de [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id u1BGb2JBW6ey for ; Mon, 13 Aug 2018 12:54:35 +0200 (CEST)
Received: from zm-01.servicemagic.eu (zm-01.servicemagic.eu [176.31.236.17]) by ymir.das-netzwerkteam.de (Postfix) with ESMTPS id 434D05DAE9 for ; Mon, 13 Aug 2018 12:54:35 +0200 (CEST)
Received: from localhost (localhost.localdomain [127.0.0.1]) by zm-01.servicemagic.eu (Postfix) with ESMTP id 1759C80A6CB54 for ; Mon, 13 Aug 2018 12:54:35 +0200 (CEST)
X-Amavis-Modified: Mail body modified (using disclaimer) - zm-01.servicemagic.eu
X-Virus-Scanned: amavisd-new at servicemagic.eu
Received: from zm-01.servicemagic.eu ([127.0.0.1]) by localhost (zm-01.servicemagic.eu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Dn73s+tTI6E4 for ; Mon, 13 Aug 2018 12:54:34 +0200 (CEST)
Received: from zm-01.servicemagic.eu (localhost.localdomain [127.0.0.1]) by zm-01.servicemagic.eu (Postfix) with ESMTP id 40D94807CB28D for ; Mon, 13 Aug 2018 12:54:34 +0200 (CEST)
Date: Mon, 13 Aug 2018 12:54:34 +0200 (CEST)
From: Walid MOGHRABI
To: submit@bugs.x2go.org
Message-ID: <1182676222.4003280.1534157674248.JavaMail.root@servicemagic.eu>
In-Reply-To: <1697772140.4003116.1534157525185.JavaMail.root@servicemagic.eu>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [195.200.167.70]
X-Mailer: Zimbra 7.2.0_GA_2669 (ZimbraWebClient - GC68 (Linux)/7.2.0_GA_2669)

package: x2gobroker
version: 0.0.4.0-0~1038~ubuntu16.04.1
priority: enhancement

We ran into another very annoying corner case.

The whole broker stack does its job quite well when everything works as expected (which is mostly the case). A failing server, for example, is handled perfectly: loadchecker simply disables that server in the list, and the broker then sends users to the remaining live servers.

The problems begin when a server misbehaves without actually going down. Loadchecker mainly checks:

* liveness ("ping")
* SSH access
* load/memory values

This is fine, but it is not enough to be sure that a server can handle incoming connections. For our needs, for example, we should also make sure that:

* a user can authenticate on the server (Active Directory authentication through PAM/Winbind)
* some mounts are correct (/home folders, other user mounts or shares)
* some networks are available
* some mandatory services are running
* ...

In our recent incident, for unknown reasons, some servers had trouble joining the Active Directory domain, so user authentication failed on them. From the broker/loadchecker side, such a server looks perfectly healthy, and it even looks like one of the best-performing servers because it is empty. The broker therefore always tries to redirect incoming connections to it, and since users cannot authenticate there, every new connection fails.
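
For illustration only, the additional checks listed above could look roughly like the following Python sketch. Nothing here is existing x2gobroker code: the function names are hypothetical, `user_resolves` only verifies that an account is visible through NSS (which covers Winbind when it is configured) rather than performing a full PAM login, and `service_ok` assumes the server uses systemd.

```python
import os
import pwd
import socket
import subprocess


def mount_ok(path):
    """Return True if `path` is currently a mount point (e.g. a shared /home)."""
    return os.path.ismount(path)


def user_resolves(username):
    """Return True if the account can be looked up through NSS.

    This only checks name resolution, not a full PAM authentication.
    """
    try:
        pwd.getpwnam(username)
        return True
    except KeyError:
        return False


def network_ok(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def service_ok(unit):
    """Return True if systemd reports `unit` as active (assumes systemd)."""
    try:
        result = subprocess.run(["systemctl", "is-active", "--quiet", unit])
    except OSError:
        # systemctl is not available on this host
        return False
    return result.returncode == 0
```

Each function returns a plain boolean, so a site could combine whichever ones matter for its own setup.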
With extended checks, this server would have been considered offline from the loadchecker's point of view and would simply have been taken out of the load balancing.

Since those checks are pretty "specific", it would be great to have some kind of "extended check" feature: an "extra-check.d" directory in the x2gobroker setup into which we could drop scripts. These scripts would be executed by the x2gobroker user on the remote servers and would return just an "ok" or "ko" value. As soon as a script returns "ko", the server would be considered unavailable and removed from load balancing until the next "ok" check.

Scripts could be written in any language whose interpreter is installed on the server (bash, perl, python or anything else); making sure the interpreter is installed is the administrator's responsibility.
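
A minimal sketch of how such an "extra-check.d" runner might behave, written in Python purely as an illustration of the proposal. The function name `run_extra_checks`, the timeout, and the rule that a script passes only when its trimmed standard output is exactly "ok" are assumptions, not existing x2gobroker behaviour. The sketch runs the scripts on the machine where it is called; how the broker would trigger it on each remote server (for example over SSH) is left open.

```python
import os
import subprocess


def run_extra_checks(check_dir, timeout=10):
    """Run every executable file in check_dir and collect "ok"/"ko" results.

    Returns (available, results), where results maps script name to "ok"
    or "ko". A script counts as "ok" only if it prints exactly "ok";
    anything else, including a crash or a timeout, counts as "ko".
    """
    results = {}
    if not os.path.isdir(check_dir):
        # No extra checks configured: do not change availability.
        return True, results

    for name in sorted(os.listdir(check_dir)):
        path = os.path.join(check_dir, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue  # skip non-executable files such as READMEs
        try:
            proc = subprocess.run(
                [path],
                stdout=subprocess.PIPE,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
                universal_newlines=True,
            )
            passed = proc.stdout.strip() == "ok"
        except (OSError, subprocess.TimeoutExpired):
            passed = False
        results[name] = "ok" if passed else "ko"

    # A single "ko" is enough to take the server out of load balancing.
    available = all(value == "ok" for value in results.values())
    return available, results
```

With this shape, a call such as `run_extra_checks("/etc/x2go/extra-check.d")` (a hypothetical path) would return `False` as soon as one script reports "ko", and `True` again once every script reports "ok" on a later run.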