07.01.2016, 22:24

Nagios has been acting up for a few days

Hi everyone,

for a good 2 weeks now my Nagios 4 has been acting up, but apparently it only does so because the whole server is misbehaving. DNS resolution is slow, logins are slow. Nagios drives all CPU cores to 100%.

When Nagios is running at 100%, this is what journalctl shows:

Source code

Jan 05 18:52:54 itmgmt nagios[180]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
Jan 05 18:52:54 itmgmt nagios[180]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 05 18:52:54 itmgmt nagios[180]: wproc: Core Worker 190: job 257638800 with pid 4269 reaped at timeout. timeouts=122; started=18538
Jan 05 18:52:54 itmgmt nagios[180]: wproc: Core Worker 190: job 18536 (pid=4276) timed out. Killing it
Jan 05 18:52:54 itmgmt nagios[180]: wproc: SERVICE PERFDATA job 18536 from worker Core Worker 190 is a non-check helper but exited with return code 3
Jan 05 18:52:54 itmgmt nagios[180]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
Jan 05 18:52:54 itmgmt nagios[180]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 05 18:52:54 itmgmt nagios[180]: wproc: Core Worker 190: job 257656160 with pid 4276 reaped at timeout. timeouts=123; started=18538
Jan 05 18:52:54 itmgmt nagios[180]: wproc: Core Worker 190: kill(-4237, SIGKILL) failed: Operation not permitted
Jan 05 18:52:54 itmgmt nagios[180]: wproc: Core Worker 190: job 18533 (pid=4237): Dormant child reaped
Jan 05 18:52:59 itmgmt nagios[180]: wproc: Core Worker 192: job 18541 (pid=4347) timed out. Killing it
Jan 05 18:52:59 itmgmt nagios[180]: wproc: Core Worker 192: kill(-4347, SIGKILL) failed: Operation not permitted
Jan 05 18:52:59 itmgmt nagios[180]: wproc: Core Worker 188: job 18541 (pid=4351) timed out. Killing it
Jan 05 18:52:59 itmgmt nagios[180]: wproc: Core Worker 188: kill(-4351, SIGKILL) failed: Operation not permitted
Jan 05 18:52:59 itmgmt nagios[180]: wproc: Core Worker 191: job 18542 (pid=4360) timed out. Killing it
Jan 05 18:52:59 itmgmt nagios[180]: wproc: Core Worker 191: kill(-4360, SIGKILL) failed: Operation not permitted
Jan 05 18:52:59 itmgmt nagios[180]: wproc: Core Worker 187: job 18542 (pid=4352) timed out. Killing it
Jan 05 18:52:59 itmgmt nagios[180]: wproc: Core Worker 187: kill(-4352, SIGKILL) failed: Operation not permitted
Jan 05 18:52:59 itmgmt nagios[180]: wproc: Core Worker 190: job 18542 (pid=4359) timed out. Killing it
Jan 05 18:52:59 itmgmt nagios[180]: wproc: Core Worker 190: kill(-4359, SIGKILL) failed: Operation not permitted
Jan 05 18:52:59 itmgmt nagios[180]: wproc: Core Worker 189: job 18542 (pid=4367) timed out. Killing it
Jan 05 18:52:59 itmgmt nagios[180]: wproc: Core Worker 189: kill(-4367, SIGKILL) failed: Operation not permitted
Jan 05 18:53:19 itmgmt nagios[180]: wproc: SERVICE PERFDATA job 18541 from worker Core Worker 192 timed out after 25.13s
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 192: job 18542 (pid=4361) timed out. Killing it
Jan 05 18:53:19 itmgmt nagios[180]: wproc: HOST PERFDATA job 18542 from worker Core Worker 192 is a non-check helper but exited with return code 3
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 192: job 0 with pid 4361 reaped at timeout. timeouts=116; started=18546
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 192: job 18543 (pid=4372) timed out. Killing it
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 192: kill(-4372, SIGKILL) failed: Operation not permitted
Jan 05 18:53:19 itmgmt nagios[180]: wproc: SERVICE PERFDATA job 18543 from worker Core Worker 192 timed out after 24.94s
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 192: job 18544 (pid=4379) timed out. Killing it
Jan 05 18:53:19 itmgmt nagios[180]: wproc: SERVICE PERFDATA job 18544 from worker Core Worker 192 is a non-check helper but exited with return code 3
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 192: job 0 with pid 4379 reaped at timeout. timeouts=117; started=18546
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 192: job 18545 (pid=4387) timed out. Killing it
Jan 05 18:53:19 itmgmt nagios[180]: wproc: HOST PERFDATA job 18545 from worker Core Worker 192 is a non-check helper but exited with return code 3
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 192: job -284925024 with pid 4387 reaped at timeout. timeouts=118; started=18546
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 192: kill(-4347, SIGKILL) failed: Operation not permitted
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 192: job 18541 (pid=4347): Dormant child reaped
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 192: kill(-4372, SIGKILL) failed: Operation not permitted
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 192: job 18543 (pid=4372): Dormant child reaped
Jan 05 18:53:19 itmgmt nagios[180]: wproc: HOST PERFDATA job 18541 from worker Core Worker 188 timed out after 25.13s
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 188: job 18542 (pid=4368) timed out. Killing it
Jan 05 18:53:19 itmgmt nagios[180]: wproc: SERVICE PERFDATA job 18542 from worker Core Worker 188 is a non-check helper but exited with return code 3
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 188: job 1592714656 with pid 4368 reaped at timeout. timeouts=116; started=18545
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 188: job 18543 (pid=4374) timed out. Killing it
Jan 05 18:53:19 itmgmt nagios[180]: wproc: SERVICE PERFDATA job 18543 from worker Core Worker 188 is a non-check helper but exited with return code 3
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 188: job 1592713904 with pid 4374 reaped at timeout. timeouts=117; started=18545
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 188: job 18544 (pid=4380) timed out. Killing it
Jan 05 18:53:19 itmgmt nagios[180]: wproc: SERVICE PERFDATA job 18542 from worker Core Worker 187 timed out after 25.13s
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 05 18:53:19 itmgmt nagios[180]: wproc: Core Worker 187: job 18543 (pid=4369) timed out. Killing it
Jan 05 18:53:19 itmgmt nagios[180]: wproc: SERVICE PERFDATA job 18543 from worker Core Worker 187 is a non-check helper but exited with return code 3
Jan 05 18:53:19 itmgmt nagios[180]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_co
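
A side note on the `wait_status=768` field that keeps appearing next to "exited with return code 3": in the conventional Linux encoding of a wait status, the exit code sits in the high byte, so 768 is simply exit code 3, which is UNKNOWN in the Nagios plugin convention. The following is a minimal Python sketch of that arithmetic; the function name `decode_wait_status` is made up for illustration and is not part of Nagios.

```python
from typing import Tuple


def decode_wait_status(status: int) -> Tuple[str, int]:
    """Decode a raw wait status, assuming the conventional Linux bit layout."""
    low_bits = status & 0x7F
    if low_bits == 0:
        # Normal exit: the exit code is stored in bits 8..15.
        return ("exited", (status >> 8) & 0xFF)
    if low_bits == 0x7F:
        # Stopped child: the stop signal is stored in bits 8..15.
        return ("stopped", (status >> 8) & 0xFF)
    # Otherwise the low 7 bits hold the terminating signal number.
    return ("signaled", low_bits)


print(decode_wait_status(768))  # ('exited', 3)
```

So the `wait_status=768` lines and the "return code 3" lines report the same event, not two separate problems.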


If I stop Nagios, the messages go away for the time being. When Nagios runs normally (which is only possible after a reboot), it also floods the log, which did not happen before:

Source code

Jan 07 22:20:11 itmgmt sudo[25421]:   nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/lib/nagios/plugins/check_dhcp -u -s
Jan 07 22:20:11 itmgmt sudo[25421]: pam_unix(sudo:session): session opened for user root by (uid=0)
Jan 07 22:20:11 itmgmt systemd[1]: Started Session c201 of user root.
Jan 07 22:20:11 itmgmt sudo[25421]: pam_unix(sudo:session): session closed for user root
Jan 07 22:20:11 itmgmt nagios[24482]: wproc: SERVICE PERFDATA job 66 from worker Core Worker 24487 is a non-check helper but exited with return code 3
Jan 07 22:20:11 itmgmt nagios[24482]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
Jan 07 22:20:11 itmgmt nagios[24482]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 07 22:20:11 itmgmt sudo[25425]:   nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/lib/nagios/plugins/check_dhcp -u -s
Jan 07 22:20:11 itmgmt sudo[25425]: pam_unix(sudo:session): session opened for user root by (uid=0)
Jan 07 22:20:11 itmgmt systemd[1]: Started Session c202 of user root.
Jan 07 22:20:11 itmgmt sudo[25425]: pam_unix(sudo:session): session closed for user root
Jan 07 22:20:11 itmgmt nagios[24482]: wproc: SERVICE PERFDATA job 66 from worker Core Worker 24486 is a non-check helper but exited with return code 3
Jan 07 22:20:11 itmgmt nagios[24482]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
Jan 07 22:20:11 itmgmt nagios[24482]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 07 22:20:11 itmgmt sudo[25427]:   nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/lib/nagios/plugins/check_dhcp -u -s
Jan 07 22:20:11 itmgmt sudo[25427]: pam_unix(sudo:session): session opened for user root by (uid=0)
Jan 07 22:20:11 itmgmt systemd[1]: Started Session c203 of user root.
Jan 07 22:20:11 itmgmt sudo[25427]: pam_unix(sudo:session): session closed for user root
Jan 07 22:20:11 itmgmt nagios[24482]: wproc: SERVICE PERFDATA job 67 from worker Core Worker 24483 is a non-check helper but exited with return code 3
Jan 07 22:20:11 itmgmt nagios[24482]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
Jan 07 22:20:11 itmgmt nagios[24482]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 07 22:20:12 itmgmt sudo[25432]:   nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/lib/nagios/plugins/check_dhcp -u -s
Jan 07 22:20:12 itmgmt sudo[25432]: pam_unix(sudo:session): session opened for user root by (uid=0)
Jan 07 22:20:12 itmgmt systemd[1]: Started Session c204 of user root.
Jan 07 22:20:12 itmgmt sudo[25432]: pam_unix(sudo:session): session closed for user root
Jan 07 22:20:12 itmgmt nagios[24482]: wproc: SERVICE PERFDATA job 67 from worker Core Worker 24484 is a non-check helper but exited with return code 3
Jan 07 22:20:12 itmgmt nagios[24482]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
Jan 07 22:20:12 itmgmt nagios[24482]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 07 22:20:14 itmgmt sudo[25439]:   nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/lib/nagios/plugins/check_dhcp -u -s
Jan 07 22:20:14 itmgmt sudo[25439]: pam_unix(sudo:session): session opened for user root by (uid=0)
Jan 07 22:20:14 itmgmt sudo[25440]:   nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/lib/nagios/plugins/check_dhcp -u -s
Jan 07 22:20:14 itmgmt sudo[25440]: pam_unix(sudo:session): session opened for user root by (uid=0)
Jan 07 22:20:14 itmgmt systemd[1]: Started Session c205 of user root.
Jan 07 22:20:14 itmgmt systemd[1]: Started Session c206 of user root.
Jan 07 22:20:14 itmgmt sudo[25440]: pam_unix(sudo:session): session closed for user root
Jan 07 22:20:14 itmgmt sudo[25439]: pam_unix(sudo:session): session closed for user root
Jan 07 22:20:14 itmgmt nagios[24482]: wproc: SERVICE PERFDATA job 68 from worker Core Worker 24484 is a non-check helper but exited with return code 3
Jan 07 22:20:14 itmgmt nagios[24482]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
Jan 07 22:20:14 itmgmt nagios[24482]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 07 22:20:14 itmgmt nagios[24482]: wproc: SERVICE PERFDATA job 68 from worker Core Worker 24485 is a non-check helper but exited with return code 3
Jan 07 22:20:14 itmgmt nagios[24482]: wproc:   early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
Jan 07 22:20:14 itmgmt nagios[24482]: wproc:   stdout line 01: check_dhcp: Invalid hostname/address -
Jan 07 22:20:15 itmgmt sudo[25450]:   nagios : TTY=unknown ; PWD=/tmp ; USER=root ; COMMAND=/usr/lib/nagios/plugins/check_dhcp -u -s
Jan 07 22:20:15 itmgmt sudo[25450]: pam_unix(sudo:session): session opened for user root by (uid=0)
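
In the sudo lines above, `check_dhcp` is always called with `-s` as its last argument, and the plugin's message "Invalid hostname/address -" ends with nothing after the dash. One possible reading, which is an assumption and not confirmed anywhere in this post, is that the server address handed to `-s` expands to an empty value. A small Python sketch for spotting such calls in journal output follows; the helper name `option_without_value` is hypothetical.

```python
import re
import shlex

# Matches the command that sudo records at the end of its log line.
COMMAND_RE = re.compile(r"COMMAND=(.*)$")


def option_without_value(journal_line: str, option: str) -> bool:
    """Return True if `option` appears in the logged command without a value."""
    match = COMMAND_RE.search(journal_line)
    if match is None:
        return False
    tokens = shlex.split(match.group(1))
    if option not in tokens:
        return False
    index = tokens.index(option)
    # No value if the option is the last token or is followed by another option.
    return index == len(tokens) - 1 or tokens[index + 1].startswith("-")


SAMPLE = (
    "Jan 07 22:20:11 itmgmt sudo[25421]:   nagios : TTY=unknown ; PWD=/tmp ; "
    "USER=root ; COMMAND=/usr/lib/nagios/plugins/check_dhcp -u -s"
)
print(option_without_value(SAMPLE, "-s"))
```

If this prints `True` for every check_dhcp line, the command definition that builds the `-s` argument would be the first place to look.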

and this as well, hundreds of thousands of times over:

Source code

Jan 06 16:54:16 itmgmt sudo[1353]: pam_systemd(sudo:session): Failed to create session: Unterbrechung während des Betriebssystemaufrufs
Jan 06 16:54:16 itmgmt sudo[1352]: pam_systemd(sudo:session): Failed to create session: Unterbrechung während des Betriebssystemaufrufs
Jan 06 16:54:16 itmgmt sudo[1351]: pam_systemd(sudo:session): Failed to create session: Unterbrechung während des Betriebssystemaufrufs
Jan 06 16:54:16 itmgmt sudo[1334]: pam_systemd(sudo:session): Failed to create session: Unterbrechung während des Betriebssystemaufrufs
Jan 06 16:54:16 itmgmt sudo[1350]: pam_systemd(sudo:session): Failed to create session: Unterbrechung während des Betriebssystemaufrufs
Jan 06 16:54:16 itmgmt sudo[1339]: pam_systemd(sudo:session): Failed to create session: Unterbrechung während des Betriebssystemaufrufs
Jan 06 16:54:16 itmgmt sudo[1333]: pam_systemd(sudo:session): Failed to create session: Unterbrechung während des Betriebssystemaufrufs
Jan 06 16:54:16 itmgmt sudo[1332]: pam_systemd(sudo:session): Failed to create session: Unterbrechung während des Betriebssystemaufrufs
Jan 06 16:54:16 itmgmt sudo[1307]: pam_systemd(sudo:session): Failed to create session: Unterbrechung während des Betriebssystemaufrufs
Jan 06 16:54:16 itmgmt sudo[1320]: pam_systemd(sudo:session): Failed to create session: Unterbrechung während des Betriebssystemaufrufs
Jan 06 16:54:13 itmgmt sudo[1253]: pam_systemd(sudo:session): Failed to create session: Activation of org.freedesktop.login1 timed out
Jan 06 16:54:13 itmgmt sudo[1254]: pam_systemd(sudo:session): Failed to create session: Activation of org.freedesktop.login1 timed out
Jan 06 16:54:13 itmgmt sudo[1238]: pam_systemd(sudo:session): Failed to create session: Activation of org.freedesktop.login1 timed out
Jan 06 16:54:13 itmgmt sudo[1255]: pam_systemd(sudo:session): Failed to create session: Activation of org.freedesktop.login1 timed out
Jan 06 16:54:13 itmgmt sudo[1251]: pam_systemd(sudo:session): Failed to create session: Activation of org.freedesktop.login1 timed out
Jan 06 16:54:13 itmgmt sudo[1256]: pam_systemd(sudo:session): Failed to create session: Activation of org.freedesktop.login1 timed out


What I tried today was restarting the "systemd-logind" service: http://serverfault.com/questions/707377/…ogin1-timed-out
It is indeed working normally again now. The question is for how long. Something seems to be pretty badly broken.
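
To see whether the problem comes back after the restart, the pam_systemd failures can be tallied by reason from saved journal output. This is a rough Python sketch under that assumption; `failure_reasons` and `RESTART_COMMAND` are illustrative names, and the script only prints the restart command instead of running it.

```python
import re
from collections import Counter
from typing import Iterable

# Captures the reason text after "Failed to create session:".
FAILURE_RE = re.compile(
    r"pam_systemd\(sudo:session\): Failed to create session: (.+)$"
)

# The restart described in the post, kept as data so nothing is executed here.
RESTART_COMMAND = ["systemctl", "restart", "systemd-logind"]


def failure_reasons(lines: Iterable[str]) -> Counter:
    """Count pam_systemd session failures per reason string."""
    reasons: Counter = Counter()
    for line in lines:
        match = FAILURE_RE.search(line.rstrip("\n"))
        if match:
            reasons[match.group(1)] += 1
    return reasons


sample = [
    "Jan 06 16:54:13 itmgmt sudo[1253]: pam_systemd(sudo:session): "
    "Failed to create session: Activation of org.freedesktop.login1 timed out",
    "Jan 06 16:54:13 itmgmt sudo[1254]: pam_systemd(sudo:session): "
    "Failed to create session: Activation of org.freedesktop.login1 timed out",
]
for reason, count in failure_reasons(sample).items():
    print(count, reason)
print("restart with:", " ".join(RESTART_COMMAND))
```

A count that starts climbing again some time after the restart would suggest the restart only treats the symptom.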

Regards,
boospy
Gentoo Can Do!

Wiki at: http://deepdoc.at

This post has been edited 1 time, last by »boospy« (07.01.2016, 22:59)