Looking for help on ancient Vicidial server - Random Pausing


Postby tomatogoatee » Fri Feb 26, 2021 3:21 pm

Hello, everyone.

We have a very old Vicidial server we have been using for a VERY long time. And I understand everyone's knee-jerk response is going to be, "You need to upgrade." I accept that, but I would really appreciate any kind of insight as to why, after over a decade of running, the server is suddenly pausing agents randomly throughout the day.

First, the server details. This is all running on a Dell 1950. Without running lshw on the install itself, I can only say that it has two 4-core processors, 16GB of RAM, and 165GB of free disk space. It has a cron job scheduled to reboot every morning at 3am. The OS is openSUSE 12.1, Asterisk 1.4.44-vici, Vicidial version 2.6-395a, build 130221-1736. All carriers are connected through SIP.

Second, the agent hardware. All agents are using Polycom phones, either the SP 300 or the 330. Our service center (by far the most users) is entirely on Ubuntu thin clients, accessing Vicidial with Firefox. The remaining users are on either Windows 7 or 10 workstations, with either Chrome or Firefox as their browser.

As I mentioned, the problem is that agents are getting randomly paused. (Meaning they either see the "Your session has been paused" alert or, in many cases, they get no indication and it's only when the real-time screen is checked that you can see them paused.) This started happening last Thursday, February 18. It started with only 2 or 3 agents. My initial suspicion was that another user was attempting to log in using one of the other agents' IDs. That was quickly proven wrong. Over the course of the week, more and more agents have been suffering from random pausing. Initially, it was limited to Windows users, as everyone using our Ubuntu thin clients was seemingly immune. Yesterday, however, even the Linux users started experiencing it. We've run the full gamut of web browsers (IE, Edge, Firefox, and Chrome), but the browser seems to have no bearing on the issue.

I've searched the internet and this forum for any clues and (from what limited information I could find) I've attempted the following troubleshooting:
Scanned the mysql database for any table corruption. (None found.)
Time sync issues. (Seemingly none, as all servers and workstations are set to sync to a singular server on our network. When checked, all machines seem to be within 1 second of each other.)
Network traffic. (I've consulted our NMS logs and, as far as I can determine, there have been no irregular spikes in traffic. To be honest, the trends seem to show there's been LESS traffic than normal.)
Misconfigured campaign. (Again, everything has been running smoothly for over a decade. But I rebuilt one of the campaigns and even created a brand new user to see if that might resolve the issue. It did not, as the new user also started suffering the pausing issue almost immediately.)

I've tried looking through the log files (vdautodial.<date> in particular) for any kind of information. While the log does show what time a given agent got paused, it doesn't exactly say what agent, extension, campaign, ingroup, or anything really. I've taken the times it lists and checked against the vicidial_user_log table in the database to find out who and what campaign. Sadly, though, it doesn't suggest what caused the pause in the first place. Oddly enough, even though the log says, "lagged call vla agent PAUSED," the user log says they were actually logged out.

Here is a snippet of the vdautodial log where an agent was paused.
Code:
2021-02-26 08:29:02|SERVER CALLS PER SECOND MAXIMUM SET TO: 20 |50||
2021-02-26 08:29:02|LIVE AGENTS LOGGED IN: 1   ACTIVE CALLS: 7|
2021-02-26 08:29:02|OLD TRUNK SHORTS CLEARED: 1 |'','NEW_SVC'||
2021-02-26 08:29:02|NEW_SVC 192.168.10.205: agents: 0 (READY: 0)    dial_level: 1     (0|1|0)   -4|
2021-02-26 08:29:02|NEW_SVC 192.168.10.205: Calls to place: -7 (0 - 7 [7 + 0|7|2]) 7 |
2021-02-26 08:29:02|CAMPAIGN DIFFERENTIAL: 2.85   0.3   (0.65 - 0.35)|
2021-02-26 08:29:02|LOCAL TRUNK SHORTAGE: 0|0  (0 - 23)|
2021-02-26 08:29:02|NEW_SVC 2: INBOUND QUEUE NO DIAL, NO DIALING|
2021-02-26 08:29:02||     dead call vac INBOUND do nothing|4385303|9122281772|CLOSER||
2021-02-26 08:29:02||     dead call vac INBOUND do nothing|4385308|8132634570|CLOSER||
2021-02-26 08:29:02||     dead call vac INBOUND do nothing|4385317|9049938721|CLOSER||
2021-02-26 08:29:02||     dead call vac INBOUND do nothing|4385318|9049030349|CLOSER||
2021-02-26 08:29:02||     dead call vac INBOUND do nothing|4385331|9049555911|CLOSER||
2021-02-26 08:29:02||     logindate UPDATED 1|'NEW_SVC'||
2021-02-26 08:29:05|SERVER CALLS PER SECOND MAXIMUM SET TO: 20 |50||
2021-02-26 08:29:05|LIVE AGENTS LOGGED IN: 1   ACTIVE CALLS: 7|
2021-02-26 08:29:05|OLD TRUNK SHORTS CLEARED: 1 |'','NEW_SVC'||
2021-02-26 08:29:05|NEW_SVC 192.168.10.205: agents: 0 (READY: 0)    dial_level: 1     (0|1|0)   -4|
2021-02-26 08:29:05|NEW_SVC 192.168.10.205: Calls to place: -7 (0 - 7 [7 + 0|7|2]) 7 |
2021-02-26 08:29:05|CAMPAIGN DIFFERENTIAL: 3.05   0.15   (0.6 - 0.45)|
2021-02-26 08:29:05|LOCAL TRUNK SHORTAGE: 0|0  (0 - 23)|
2021-02-26 08:29:05|NEW_SVC 2: INBOUND QUEUE NO DIAL, NO DIALING|
2021-02-26 08:29:05||     dead call vac INBOUND do nothing|4385303|9122281772|CLOSER||
2021-02-26 08:29:05||     dead call vac INBOUND do nothing|4385308|8132634570|CLOSER||
2021-02-26 08:29:05||     dead call vac INBOUND do nothing|4385317|9049938721|CLOSER||
2021-02-26 08:29:05||     dead call vac INBOUND do nothing|4385318|9049030349|CLOSER||
2021-02-26 08:29:05||     dead call vac INBOUND do nothing|4385331|9049555911|CLOSER||
2021-02-26 08:29:05||     lagged call vla agent PAUSED 1|1|20210226082835|20210226082855|20210226082905||
2021-02-26 08:29:05||          lagged agent LOGOUT entry inserted 7256|INBOUND|||
2021-02-26 08:29:05||     logindate UPDATED 1|'NEW_SVC'||
2021-02-26 08:29:05||     updating server parameters 23|8365|-5|default||
2021-02-26 08:29:07|SERVER CALLS PER SECOND MAXIMUM SET TO: 20 |50||
2021-02-26 08:29:07|LIVE AGENTS LOGGED IN: 1   ACTIVE CALLS: 7|
2021-02-26 08:29:07|OLD TRUNK SHORTS CLEARED: 1 |'','NEW_SVC'||
2021-02-26 08:29:07|NEW_SVC 192.168.10.205: agents: 0 (READY: 0)    dial_level: 1     (0|1|0)   -4|
2021-02-26 08:29:07|NEW_SVC 192.168.10.205: Calls to place: -7 (0 - 7 [7 + 0|7|2]) 7 |
2021-02-26 08:29:07|CAMPAIGN DIFFERENTIAL: 3.25   0   (0.55 - 0.55)|
2021-02-26 08:29:07|LOCAL TRUNK SHORTAGE: 0|0  (0 - 23)|
2021-02-26 08:29:07|NEW_SVC 2: INBOUND QUEUE NO DIAL, NO DIALING|
2021-02-26 08:29:07||     dead call vac INBOUND do nothing|4385303|9122281772|CLOSER||
2021-02-26 08:29:07||     dead call vac INBOUND do nothing|4385308|8132634570|CLOSER||
2021-02-26 08:29:07||     dead call vac INBOUND do nothing|4385317|9049938721|CLOSER||
2021-02-26 08:29:07||     dead call vac INBOUND do nothing|4385318|9049030349|CLOSER||
2021-02-26 08:29:07||     dead call vac INBOUND do nothing|4385331|9049555911|CLOSER||
2021-02-26 08:29:07||     logindate UPDATED 1|'NEW_SVC'||


I will gladly provide any additional information that may help to find the cause of the problem. In the meantime, I have a ghosted image of this server from 2018 that I'm going to attempt to restore, but I would really like to carry the database over so I'm not stuck recreating a couple hundred users and a dozen or so routes.

One other thing that I only noticed in the log files today: the pausing seems to start just before 8am and never happens after 8pm.
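
For anyone who wants to check that time-of-day pattern without eyeballing the log, here is a rough Python sketch that pulls the "lagged call vla agent PAUSED" lines out of a vdautodial log and counts them per hour. It assumes only what the snippet above shows: pipe-delimited lines that begin with a YYYY-MM-DD HH:MM:SS timestamp. The function name and the sample list are made up for illustration, and the numeric fields after PAUSED are left alone because I don't know what they mean.

```python
from collections import Counter
from datetime import datetime

MARKER = "lagged call vla agent PAUSED"

def paused_events_by_hour(lines):
    """Count 'lagged ... PAUSED' log lines per hour of day."""
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        # The first pipe-delimited field is the log timestamp.
        stamp = line.split("|", 1)[0].strip()
        try:
            when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        counts[when.hour] += 1
    return counts

# Sample lines copied from the log snippet above.
sample = [
    "2021-02-26 08:29:05||     dead call vac INBOUND do nothing|4385331|9049555911|CLOSER||",
    "2021-02-26 08:29:05||     lagged call vla agent PAUSED 1|1|20210226082835|20210226082855|20210226082905||",
    "2021-02-26 08:29:05||          lagged agent LOGOUT entry inserted 7256|INBOUND|||",
]

for hour, count in sorted(paused_events_by_hour(sample).items()):
    print(f"{hour:02d}:00  {count}")
```

To run it against a real file, pass the open file object instead of the sample list; the log path will depend on your install.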
tomatogoatee
 
Posts: 4
Joined: Thu Feb 25, 2021 3:05 pm

Re: Looking for help on ancient Vicidial server - Random Pau

Postby carpenox » Fri Feb 26, 2021 4:01 pm

If you are experiencing the disconnects due to lag, that leads me to believe this is network/LAN related, or perhaps a firewall issue. Is this server hosted locally in a call center with all the agents local, or are they all over? Did your host change their firewall?
Alma Linux 9.3 | SVN Version: 3822 | DB Schema Version: 1711 | Asterisk 18.18.1
www.dialer.one -:- 1-833-DIALER-1 -:- https://linktr.ee/CyburDial -:- WhatsApp: +19549477572 -:- Skype: live:carpenox_3 | Discord: https://discord.gg/DVktk6smbh
carpenox
 
Posts: 2247
Joined: Wed Apr 08, 2020 2:02 am
Location: St Petersburg, FL

Re: Looking for help on ancient Vicidial server - Random Pau

Postby mflorell » Fri Feb 26, 2021 11:34 pm

How many calls per day?

Are you archiving the log tables at all?

Do you have slow query logging enabled on MySQL?
mflorell
Site Admin
 
Posts: 18338
Joined: Wed Jun 07, 2006 2:45 pm
Location: Florida

Re: Looking for help on ancient Vicidial server - Random Pau

Postby alo » Sat Feb 27, 2021 2:34 pm

We had this happen after the last Chrome update.
Your mention of "checking the real-time" makes me think these agents have the real-time report active instead of the agent interface, which means their agent interface is considered inactive by Chrome.

We have been disabling chrome://flags/#intensive-wake-up-throttling and it seems to solve our issue.
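
To show why throttling would look like a pause from the server side, here is a toy Python simulation (not Vicidial or Chrome code). It assumes the agent page checks in on a timer every second, that a hidden tab gets its timers limited to about once a minute (which is how I understand intensive wake-up throttling to work), and that the server treats an agent as lagged after some number of seconds of silence. The 1-second interval, the 30-second lag limit, and the function names are all made-up values for illustration.

```python
def checkin_times(duration_s, interval_s, hidden_from_s=None, throttled_s=60):
    """Seconds at which a hypothetical agent page checks in with the server."""
    times = []
    t = 0
    while t <= duration_s:
        times.append(t)
        # Once the tab counts as hidden, timers fire only every throttled_s seconds.
        throttled = hidden_from_s is not None and t >= hidden_from_s
        t += throttled_s if throttled else interval_s
    return times

def longest_silence(times):
    """Longest gap, in seconds, between consecutive check-ins."""
    return max(later - earlier for earlier, later in zip(times, times[1:]))

LAG_LIMIT_S = 30  # assumed server-side limit, not a real Vicidial setting

visible = checkin_times(600, interval_s=1)
hidden = checkin_times(600, interval_s=1, hidden_from_s=300)

for label, times in (("visible tab", visible), ("hidden tab", hidden)):
    gap = longest_silence(times)
    print(label, "longest silence:", gap, "s, lagged:", gap > LAG_LIMIT_S)
```

With these assumed numbers, the visible tab never goes quiet for more than a second, while the hidden tab goes quiet for a full minute at a time, which is longer than the assumed lag limit.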
alo
 
Posts: 189
Joined: Wed Jun 20, 2012 10:21 am

Re: Looking for help on ancient Vicidial server - Random Pau

Postby carpenox » Sun Feb 28, 2021 7:30 am

After Chrome 87 or 88? You said the last update, so I'm guessing you meant 88. Try installing these packages and see if it helps the issue:

zypper in ncurses-devel libxml2-devel sqlite-devel libsrtp-devel libuuid-devel openssl-devel

Re: Looking for help on ancient Vicidial server - Random Pau

Postby tomatogoatee » Mon Mar 01, 2021 9:34 am

Thank you all for replying. I posted on Friday and was not in the office over the weekend, so I couldn't answer any of your follow-up questions. I will try to address them all now.

carpenox wrote: If you are experiencing the disconnects due to lag, that leads me to believe this is network/LAN related, or perhaps a firewall issue. Is this server hosted locally in a call center with all the agents local, or are they all over? Did your host change their firewall?


The server is hosted locally. It is a physical machine (not VM) and there is no intermediary firewall between buildings/switches.

mflorell wrote: How many calls per day?

Are you archiving the log tables at all?

Do you have slow query logging enabled on MySQL?


I ran reports for the server and came back with calls per day from 2-15 to 2-28:
Code:
2/15: 1192
2/16: 1513
2/17: 1645
2/18: 1323 (The day we started seeing the problem)
2/19: 1300
2/20: 380
2/21: 210
2/22: 1293
2/23: 1297
2/24: 1184
2/25: 1280
2/26: 1421 66 paused
2/27: 693 19 paused
2/28: 261 26 paused

(I would have listed the previous days' pause counts, but I did not think to back up the log files before they were rotated out.)
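
As a quick sanity check on those last three days, the pause counts can be turned into a rate per call. This is just arithmetic on the numbers listed above, written as a small Python sketch; the function and variable names are my own.

```python
# (date, calls, paused) for the three days where pause counts were available
days = [("2/26", 1421, 66), ("2/27", 693, 19), ("2/28", 261, 26)]

def pause_rate_percent(calls, paused):
    """Paused events as a percentage of calls for one day."""
    return 100.0 * paused / calls

for date, calls, paused in days:
    print(f"{date}: {pause_rate_percent(calls, paused):.1f}% of calls")
```

That works out to roughly 4.6%, 2.7%, and 10.0%. Three days is not much to go on, but at least over this stretch the pauses do not appear to scale with call volume.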
We are not doing any manual archiving of the log tables. There are no cron jobs scheduled to do any sort of archiving, either. There are several .pl scripts scheduled that (given the name) are doing regular DB maintenance, but I believe they are all stock scripts. Nothing we manually implemented.
Slow query logging is enabled. The log file is over 90MB. If you would like to see it, I will upload it to a google drive folder. Unless there is something specific I should look for in it.

alo wrote: We had this happen after the last Chrome update.
Your mention of "checking the real-time" makes me think these agents have the real-time report active instead of the agent interface, which means their agent interface is considered inactive by Chrome.

We have been disabling chrome://flags/#intensive-wake-up-throttling and it seems to solve our issue.


Like I mentioned, our users have tried different browsers. However, I did a little research and the first few people to start experiencing this were, in fact, using Chrome exclusively. I will attempt this fix, but that leaves me at a loss for the agents using Firefox on the Linux side. They are using version 52.0.2 and are locked to that because the Ubuntu LTSP server is 12.04, which hasn't offered any updates in years. But that version has been working flawlessly for so long that I highly doubt there is an issue with any settings on that side.

carpenox wrote: After Chrome 87 or 88? You said the last update, so I'm guessing you meant 88. Try installing these packages and see if it helps the issue:

zypper in ncurses-devel libxml2-devel sqlite-devel libsrtp-devel libuuid-devel openssl-devel


I'm going to attempt Alo's fix first. I'm hesitant to install new packages on a server this old before I've had a chance to back it up. If his suggestions seem to have no impact, I'll consider this.

Thank you all for your suggestions so far. In the meantime, I have gotten hold of a new server and, as a contingency, will begin installing a new, latest-version Vicibox machine. I'm still hoping to resolve the issues plaguing the current install, as re-creating all the users, campaigns, ingroups, extensions, etc. is a very time-consuming task.

Re: Looking for help on ancient Vicidial server - Random Pau

Postby mflorell » Mon Mar 01, 2021 2:54 pm

As for the archiving of logs, I'd suggest taking a look at the "ADMIN_archive_log_tables.pl" perl script, even if you want to set it to something like "--months=48", you may still gain some performance given how long this system has been up.

As for the slow query log, no need to post the whole thing, but since you have one, you can look at the exact date/time you are experiencing these issues and see if there may be a DB slowdown happening around that time.
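
Something along these lines could narrow a big slow query log down to the entries near a pause. It is a rough Python sketch that assumes the log contains "# Query_time:" header lines followed by "SET timestamp=<epoch>;" lines, so check that your MySQL version writes them that way. The function name, the 60-second window, and the sample entries are made up for illustration, and the sample treats the 08:29:05 pause time as UTC only to keep the example self-contained.

```python
import re
from datetime import datetime, timezone

QUERY_TIME = re.compile(r"^# Query_time:\s*([\d.]+)")
SET_TIMESTAMP = re.compile(r"^SET timestamp=(\d+);")

def slow_queries_near(lines, pause_epoch, window_s=60):
    """Return (epoch, query_time_s) for slow-log entries within window_s of pause_epoch."""
    hits = []
    query_time = None
    for line in lines:
        m = QUERY_TIME.match(line)
        if m:
            query_time = float(m.group(1))
            continue
        m = SET_TIMESTAMP.match(line)
        if m and query_time is not None:
            epoch = int(m.group(1))
            if abs(epoch - pause_epoch) <= window_s:
                hits.append((epoch, query_time))
            query_time = None  # wait for the next entry's header
    return hits

# Pause time from the vdautodial log, treated as UTC for this example only.
pause_epoch = int(datetime(2021, 2, 26, 8, 29, 5, tzinfo=timezone.utc).timestamp())

# Invented slow-log entries: one 5 seconds before the pause, one hours earlier.
sample = [
    "# Query_time: 12.500000  Lock_time: 0.000100 Rows_sent: 1  Rows_examined: 900000",
    f"SET timestamp={pause_epoch - 5};",
    "SELECT 1;",
    "# Query_time: 3.000000  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 10",
    f"SET timestamp={pause_epoch - 20000};",
    "SELECT 1;",
]

for epoch, seconds in slow_queries_near(sample, pause_epoch):
    print(epoch, seconds)
```

If several pause times line up with long-running queries, that would point at the database rather than the network or the browsers.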

Re: Looking for help on ancient Vicidial server - Random Pau

Postby jrecord57 » Tue Mar 09, 2021 11:49 am

Have there been any solutions to this yet? I have a system that has only been running for less than a year and was installed with all the current updates. Everything was working fine until about a week ago.

See, I have three managers on payroll but unfortunately have had to make some cutbacks. In order for me to be able to afford to keep them on, they have all agreed to get back on the phones and help with production. These are the only ones that are experiencing this issue. Someone posted above about their agents also having the Real-Time Reporting screen up, which was causing Chrome to pause JavaScript in the background/inactive tabs they have open and thus pausing the agent (I am paraphrasing from memory, but it was something to that effect). I thought this was definitely the issue because they all do LOVE to monitor the real-time report, so I followed the easy fix the person suggested. It seemed to help at first, but the problem has arisen again. For now I told them to just close out the real-time report when they are on the phone, but they REALLY want to be able to have both, so I was just checking to see if anyone has found a solution to this issue.

I even suggested switching browsers but idk if it really is a Chrome thing, or if all browsers have a similar setting. Either way though I disabled it in Chrome so if that was the issue, that should have fixed it. Anyways, any suggestions are greatly appreciated, thanks in advance for the assistance.
ViciBox 9.0.3 from .iso | ViciDial VERSION: 2.14-834a BUILD: 211208-1646 | Asterisk 13.34.0-vici | 3 Server Cluster (DB, Web, Telephony) | No Digium/Sangoma Hardware | No Extra Software After Installation | Xeon E3-1230 v6 Quad-Core 3.5 GHz
jrecord57
 
Posts: 9
Joined: Sun Sep 13, 2020 10:32 am

Re: Looking for help on ancient Vicidial server - Random Pau

Postby mflorell » Thu Mar 18, 2021 1:58 am

We've just added some new features that can mitigate the recent Chrome Javascript Throttling issues.

The new System Settings for "Agent Hidden Browser Sound" can be set to play a quiet soundfile on the agent web browser every 20 seconds that the agent screen is hidden from view, which will deactivate the Chrome browser Javascript Throttling.

If you would like to try this, upgrade to svn/trunk revision 3390 and configure "Agent Hidden Browser Sound Seconds" to '20', then log in as an agent and move to a different browser tab to test.

Re: Looking for help on ancient Vicidial server - Random Pau

Postby carpenox » Thu Mar 18, 2021 8:29 am

beast mode, fuck off chrome! haha i love it

Re: Looking for help on ancient Vicidial server - Random Pau

Postby tomatogoatee » Fri Mar 19, 2021 2:40 pm

First off, let me apologize for not replying to this thread sooner. I intended to reply with the result of the fix after three days, but then got distracted with other (non-Vicidial-related) problems.

Alo's suggestion of disabling the intensive-wake-up-throttling flag in Chrome worked famously! Not entirely sure what was causing the problems with our thin clients using Firefox, but they stopped having issues as soon as we set that flag on all the Chrome users.

