X2Go Bug report logs - #1315
loadchecker bad auto sleep time leads to faulty service

version graph

Package: x2gobroker-loadchecker; Maintainer for x2gobroker-loadchecker is X2Go Developers <x2go-dev@lists.x2go.org>; Source for x2gobroker-loadchecker is src:x2gobroker.

Reported by: Walid MOGHRABI <w.moghrabi@servicemagic.eu>

Date: Mon, 13 Aug 2018 10:45:03 UTC

Severity: normal

Tags: pending

Found in version 0.0.4.0-0~1038~ubuntu16.04.1

Fixed in version 0.0.4.0

Reply or subscribe to this bug.

Toggle useless messages

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to x2go-dev@lists.x2go.org, owner@bugs.x2go.org:
Bug#1315; Package x2gobroker-loadchecker. (Mon, 13 Aug 2018 10:45:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Walid MOGHRABI <w.moghrabi@servicemagic.eu>:
New Bug report received and forwarded. Copy sent to owner@bugs.x2go.org.

Your message specified a Severity: in the pseudo-header, but the severity value bug was not recognised. The default severity normal is being used instead. The recognised values are: critical, grave, important, normal, minor, wishlist.

(Mon, 13 Aug 2018 10:45:03 GMT) Full text and rfc822 format available.


Message #5 received at submit@bugs.x2go.org (full text, mbox):

From: Walid MOGHRABI <w.moghrabi@servicemagic.eu>
To: submit@bugs.x2go.org
Subject: loadchecker bad auto sleep time leads to faulty service
Date: Mon, 13 Aug 2018 12:43:45 +0200 (CEST)
package: x2gobroker-loadchecker
version: 0.0.4.0-0~1038~ubuntu16.04.1 
priority: bug 


I had a problem with broker loadbalancing not working anymore even though the loadchecker service seemed to work well.
Looking at the logs, I found an unusual "sleep time" for the loadchecker service then, after that, it seemed to never re-caclulate load on remote servers.

---
2018-08-08 08:43:40,379 - loadchecker - INFO - Executing agent command on remote host tce-server-14 (10.50.0.214): sh -c '/usr/lib/x2go/x2gobroker-agent foo checkload'
2018-08-08 08:43:40,812 - loadchecker - INFO - Broker agent answered: OK; loadavgXX:49; memAvail:21904; myMemAvail:22030; numCPU:16; typeCPU:2400; 
2018-08-08 08:43:40,812 - loadchecker - INFO - LoadChecker.loadchecker(): contacted remote broker agent for backend=inifile, profile_id=WIN-TRAVAUX, hostname=tce-server-14, new load factor is: 1716558
2018-08-08 08:43:40,813 - loadchecker - DEBUG - LoadChecker.loadchecker(): sleeping for 6.976744186046512secs before querying next server
2018-08-08 08:43:48,051 - loadchecker - INFO - Executing agent command on remote host tce-server-13 (10.50.0.213): sh -c '/usr/lib/x2go/x2gobroker-agent foo checkload'
2018-08-08 08:43:48,497 - loadchecker - INFO - Broker agent answered: OK; loadavgXX:126; memAvail:21117; myMemAvail:21242; numCPU:16; typeCPU:2400; 
2018-08-08 08:43:48,497 - loadchecker - INFO - LoadChecker.loadchecker(): contacted remote broker agent for backend=inifile, profile_id=WIN-TRAVAUX, hostname=tce-server-13, new load factor is: 643565
2018-08-08 08:43:48,498 - loadchecker - DEBUG - LoadChecker.loadchecker(): sleeping for 6.976744186046512secs before querying next server
2018-08-08 08:43:55,734 - loadchecker - INFO - Executing agent command on remote host tce-server-12 (10.50.0.212): sh -c '/usr/lib/x2go/x2gobroker-agent foo checkload'
2018-08-08 08:43:56,162 - loadchecker - INFO - Broker agent answered: OK; loadavgXX:52; memAvail:22260; myMemAvail:22386; numCPU:16; typeCPU:2400; 
2018-08-08 08:43:56,162 - loadchecker - INFO - LoadChecker.loadchecker(): contacted remote broker agent for backend=inifile, profile_id=WIN-TRAVAUX, hostname=tce-server-12, new load factor is: 1643815
2018-08-08 08:43:56,163 - loadchecker - DEBUG - LoadChecker.loadchecker(): sleeping for 6.976744186046512secs before querying next server
2018-08-08 08:44:03,404 - loadchecker - INFO - Executing agent command on remote host tce-server-11 (10.50.0.211): sh -c '/usr/lib/x2go/x2gobroker-agent foo checkload'
2018-08-08 08:44:03,850 - loadchecker - INFO - Broker agent answered: OK; loadavgXX:123; memAvail:21398; myMemAvail:21524; numCPU:16; typeCPU:2400; 
2018-08-08 08:44:03,851 - loadchecker - INFO - LoadChecker.loadchecker(): contacted remote broker agent for backend=inifile, profile_id=WIN-TRAVAUX, hostname=tce-server-11, new load factor is: 668035
2018-08-08 08:44:03,851 - loadchecker - DEBUG - LoadChecker.loadchecker(): sleeping for 6.976744186046512secs before querying next server
2018-08-08 08:44:11,093 - loadchecker - INFO - Executing agent command on remote host tce-server-10 (10.50.0.210): sh -c '/usr/lib/x2go/x2gobroker-agent foo checkload'
2018-08-08 08:44:11,532 - loadchecker - INFO - Broker agent answered: OK; loadavgXX:84; memAvail:22133; myMemAvail:22259; numCPU:16; typeCPU:2400; 
2018-08-08 08:44:11,532 - loadchecker - INFO - LoadChecker.loadchecker(): contacted remote broker agent for backend=inifile, profile_id=WIN-TRAVAUX, hostname=tce-server-10, new load factor is: 1011794
2018-08-08 08:44:11,533 - loadchecker - DEBUG - LoadChecker.loadchecker(): sleeping for 6.976744186046512secs before querying next server
2018-08-08 08:44:18,774 - loadchecker - INFO - Executing agent command on remote host tce-server-09 (10.50.0.209): sh -c '/usr/lib/x2go/x2gobroker-agent foo checkload'
2018-08-08 08:44:19,206 - loadchecker - INFO - Broker agent answered: OK; loadavgXX:93; memAvail:21758; myMemAvail:21883; numCPU:16; typeCPU:2400; 
2018-08-08 08:44:19,206 - loadchecker - INFO - LoadChecker.loadchecker(): contacted remote broker agent for backend=inifile, profile_id=WIN-TRAVAUX, hostname=tce-server-09, new load factor is: 898394
2018-08-08 08:44:19,206 - loadchecker - DEBUG - LoadChecker.loadchecker(): sleeping for 6.976744186046512secs before querying next server
2018-08-08 08:44:26,189 - loadchecker - DEBUG - LoadChecker.loadchecker(): performed 44 queries (failures: 0), sleeping for -6.976744186046517secs before starting next query cycle
2018-08-08 08:46:47,377 - loadchecker - DEBUG - LoadCheckServiceHandler.handle_read(): received load check query: backend=inifile, profile_id=TRAVAUX, hostname=
2018-08-08 08:46:47,377 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-11: 763244
2018-08-08 08:46:47,377 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-23: 1742811
2018-08-08 08:46:47,377 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-26: 507295
2018-08-08 08:46:47,378 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-18: 899328
2018-08-08 08:46:47,378 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-19: 932160
2018-08-08 08:46:47,378 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-09: 879966
2018-08-08 08:46:47,378 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-28: 834700
2018-08-08 08:46:47,378 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-17: 721955
2018-08-08 08:46:47,378 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-25: 715089
2018-08-08 08:46:47,379 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-29: 1195074
2018-08-08 08:46:47,379 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-20: 1431168
2018-08-08 08:46:47,379 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-27: 1196586
2018-08-08 08:46:47,379 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-31: 1972650
2018-08-08 08:46:47,379 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-13: 735827
2018-08-08 08:46:47,379 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-24: 780717
2018-08-08 08:46:47,379 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-10: 932487
2018-08-08 08:46:47,380 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-12: 1237871
2018-08-08 08:46:47,380 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-30: 1371001
2018-08-08 08:46:47,380 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-21: 587846
2018-08-08 08:46:47,380 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-14: 1616787
2018-08-08 08:46:47,380 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-15: 741288
2018-08-08 08:46:47,380 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-22: 924074
2018-08-08 08:46:54,245 - loadchecker - DEBUG - LoadCheckServiceHandler.handle_read(): received load check query: backend=inifile, profile_id=TRAVAUX, hostname=
2018-08-08 08:46:54,245 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-11: 763244
2018-08-08 08:46:54,245 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-23: 1742811
2018-08-08 08:46:54,246 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-26: 507295
2018-08-08 08:46:54,246 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-18: 899328
2018-08-08 08:46:54,246 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-19: 932160
2018-08-08 08:46:54,246 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-09: 879966
2018-08-08 08:46:54,246 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-28: 834700
2018-08-08 08:46:54,246 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-17: 721955
2018-08-08 08:46:54,246 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-25: 715089
2018-08-08 08:46:54,246 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-29: 1195074
2018-08-08 08:46:54,246 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-20: 1431168
2018-08-08 08:46:54,247 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-27: 1196586
2018-08-08 08:46:54,247 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-31: 1972650
2018-08-08 08:46:54,247 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-13: 735827
2018-08-08 08:46:54,247 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-24: 780717
2018-08-08 08:46:54,247 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-10: 932487
2018-08-08 08:46:54,247 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-12: 1237871
2018-08-08 08:46:54,247 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-30: 1371001
2018-08-08 08:46:54,247 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-21: 587846
2018-08-08 08:46:54,247 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-14: 1616787
2018-08-08 08:46:54,247 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-15: 741288
2018-08-08 08:46:54,248 - loadchecker - INFO - LoadCheckServiceHandler.handle_read(): load check result for backend=inifile, profile_id=TRAVAUX, hostname=tce-server-22: 924074
2018-08-08 08:47:21,453 - loadchecker - DEBUG - LoadCheckServiceHandler.handle_read(): received load check query: backend=inifile, profile_id=TRAVAUX, hostname=
---

As you can see, there is a strange line with a '-6.976744186046517secs' value for sleep time which seems to be interpreted as an infinite sleep time by the process and then, load is never recalculated :

---
2018-08-08 08:44:26,189 - loadchecker - DEBUG - LoadChecker.loadchecker(): performed 44 queries (failures: 0), sleeping for -6.976744186046517secs before starting next query cycle
---

I don't see any reason why this happens and I was unable to reproduce it manually, it is a rare case but it already happened more than once.
---
DISCLAIMER: This e-mail is private and confidential and may contain proprietary or legally privileged information. It is for the intended recipient only. If you have received this email in error, please notify the author by replying to it and then destroy it. If you are not the intended recipient you must not use, disclose, distribute, copy, print or rely on this e-mail or any attachment. Thank you


Information forwarded to x2go-dev@lists.x2go.org, X2Go Developers <x2go-dev@lists.x2go.org>:
Bug#1315; Package x2gobroker-loadchecker. (Thu, 13 Sep 2018 13:55:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Mike Gabriel <mike.gabriel@das-netzwerkteam.de>:
Extra info received and forwarded to list. Copy sent to X2Go Developers <x2go-dev@lists.x2go.org>. (Thu, 13 Sep 2018 13:55:03 GMT) Full text and rfc822 format available.

Message #10 received at 1315@bugs.x2go.org (full text, mbox):

From: Mike Gabriel <mike.gabriel@das-netzwerkteam.de>
To: 1315-submitter@bugs.x2go.org
Cc: control@bugs.x2go.org, 1315@bugs.x2go.org
Subject: X2Go issue (in src:x2gobroker) has been marked as pending for release
Date: Thu, 13 Sep 2018 15:50:32 +0200 (CEST)
tag #1315 pending
fixed #1315 0.0.4.0
thanks

Hello,

X2Go issue #1315 (src:x2gobroker) reported by you has been
fixed in X2Go Git. You can see the changelog below, and you can
check the diff of the fix at:

    http://code.x2go.org/gitweb?p=x2gobroker.git;a=commitdiff;h=533a37b

The issue will most likely be fixed in src:x2gobroker (0.0.4.0).

light+love
X2Go Git Admin (on behalf of the sender of this mail)

---
commit 533a37bc1650b1caf48e03a101df18e2c89b6f79
Author: Mike Gabriel <mike.gabriel@das-netzwerkteam.de>
Date:   Thu Sep 13 15:49:44 2018 +0200

    x2gobroker/loadchecker.py: Avoid rare cases where at the end of a load checking cycle a negative sleep time would have been calculated. (Fixes: #1315). Thanks to Walid Moghrabi for catching this.

diff --git a/debian/changelog b/debian/changelog
index cbe2446..6c8f0df 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -80,6 +80,9 @@ x2gobroker (0.0.4.0-0x2go1) UNRELEASED; urgency=medium
     - x2gobroker/loadchecker.py: Don't re-read the x2gobroker.conf during
       each cycle of the load checking loop. Rather read it on service startup
       and require a service restart when x2gobroker.conf has been changed.
+    - x2gobroker/loadchecker.py: Avoid rare cases where at the end of a load
+      checking cycle a negative sleep time would have been calculated.
+      (Fixes: #1315). Thanks to Walid Moghrabi for catching this.
   * debian/*:
     + Trigger Makefile's install target and install those files. Drop debhelper
       from-source-installation magic.


Added tag(s) pending. Request was from Mike Gabriel <mike.gabriel@das-netzwerkteam.de> to control@bugs.x2go.org. (Thu, 13 Sep 2018 13:55:04 GMT) Full text and rfc822 format available.

Marked as fixed in versions 0.0.4.0. Request was from Mike Gabriel <mike.gabriel@das-netzwerkteam.de> to control@bugs.x2go.org. (Thu, 13 Sep 2018 13:55:04 GMT) Full text and rfc822 format available.

Message sent on to Walid MOGHRABI <w.moghrabi@servicemagic.eu>:
Bug#1315. (Thu, 13 Sep 2018 13:55:05 GMT) Full text and rfc822 format available.

Send a report that this bug log contains spam.


X2Go Developers <owner@bugs.x2go.org>. Last modified: Thu Dec 13 13:07:03 2018; Machine Name: ymir.das-netzwerkteam.de

X2Go Bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.