Anybody else noticing a very noticeable delay in pages being displayed from the forum? By noticeable I mean 30 to 60 seconds for a page to load, even the login page. I am not noticing any delays on other sites I visit. Wondering if this is a forum issue or my VPN service.
It is OK for me, usually only a few seconds, as is the other Discourse forum I frequent. I stay logged in so only rarely visit the login page.
Stuart
The two deprecation warnings have been reported. There is a major rework of the guts of Discourse underway and those warnings should be addressed by that work.
The other errors look like Google Analytics. If you have an ad or tracking blocker turned on then you will get those errors but they shouldn’t be having any effect on access. If you have them blocked then the system just moves on accepting that you’re blocking them.
Otherwise, I don’t know what may be causing it. I’m not seeing response times any different to usual at the moment.
Today I was able to log in to the forum readily with no delay. My VPN blocks ads, as does Bitdefender, so I’m sure those errors are probably “normal” for me. I’m getting too technologically outdated to do much troubleshooting, so before I go chasing issues I like to first check whether anybody else is having them.
We now return you to your regular forum postings…
I’m still not seeing poor response times in the forum, but there is some background disk activity going on that might be causing things to run slow at times. It’s possible that when the activity is working on disk areas that the forum uses, the forum slows down, but is OK at other times.
Just wanted to expound on the issue: when I say “very noticeable delay” I mean 30 to 60 seconds, not what somebody would define as a poor response time. But as I mentioned earlier, today it was not noticed.
I agree 30-60 seconds goes beyond poor performance. The RAID array is rebuilding itself for some reason and that’s a long/slow process.
I thought that a rebuild usually happened after a hard drive is replaced.
That’s one reason, but not in this case as I’ve not had a drive fail. SMART (*) is suggesting the physical drives are OK. I suspect that the RAID subsystem has detected something that looks odd and decided the best way to resolve it is to rebuild the array.
There are 4 RAID partitions in total. 2 are clean and running properly. One is marked clean but has a delayed resync almost certainly due to the check taking place on the fourth partition. The fourth partition is marked clean but is being checked which I need to do some research on because I’d have thought those states were mutually exclusive! Unfortunately the fourth partition is the biggest one so it will take longest to check. Only about 3 hours left now though - it seems to have speeded up.
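On the "clean but being checked" puzzle: assuming this is Linux software RAID (md), which the thread doesn't actually confirm, the kernel reports two separate attributes per array in sysfs. `array_state` holds values such as `clean` or `active`, while `sync_action` holds values such as `idle`, `check` or `resync`. If that assumption holds, an array can be `clean` and running a `check` at the same time because the two values answer different questions. A minimal sketch that reads both (the function name `md_status` is made up for illustration):

```python
from pathlib import Path


def md_status(sys_block="/sys/block"):
    """Return {array_name: (array_state, sync_action)} for each md array found."""
    status = {}
    for md_dir in sorted(Path(sys_block).glob("md*/md")):
        state = (md_dir / "array_state").read_text().strip()
        action_file = md_dir / "sync_action"
        # sync_action may be missing on arrays with no redundancy, so guard for it
        action = action_file.read_text().strip() if action_file.exists() else "n/a"
        status[md_dir.parent.name] = (state, action)
    return status


def format_status(status):
    """Render one line per array, e.g. for printing or logging."""
    return [f"{name}: array_state={state} sync_action={action}"
            for name, (state, action) in status.items()]
```

Printing `format_status(md_status())` on the server would list each array with both values side by side. The `sys_block` parameter only exists so the function can be pointed at a test directory.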
I’ll need to trawl through lots of log files to see if I can find the original cause of the rebuild.
(*) SMART actually gives some interesting (geek) facts. The drives have been running for 33892 hours (just short of 4 years) and in that time I’ve power cycled the server 17 times, or about once every 3 months.
I now know why the RAID array check started…it’s scheduled (cron) to run at 01:16 on the 12th of the month and has been doing so since at least 12th March 2023. I’ve never seen it take this long before so I assume it must have found something it didn’t like this time, but what that was doesn’t seem to be logged anywhere. So other than waiting for it to finish I’m stumped about any subsequent steps I can take.
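For anyone wondering what that schedule looks like: in the standard five-field cron format (minute, hour, day of month, month, day of week), "01:16 on the 12th of every month" would be written `16 1 12 * *`. The actual crontab line isn't shown in this thread, so treat that as an assumption. As a rough sketch, the next run time can be worked out like this (`next_check` is a hypothetical helper, not part of any RAID tooling):

```python
from datetime import datetime


def next_check(now, day=12, hour=1, minute=16):
    """Return the next monthly run time strictly after `now`."""
    candidate = now.replace(day=day, hour=hour, minute=minute,
                            second=0, microsecond=0)
    if candidate <= now:
        # This month's slot has already passed, so roll over to next month
        if now.month == 12:
            candidate = candidate.replace(year=now.year + 1, month=1)
        else:
            candidate = candidate.replace(month=now.month + 1)
    return candidate
```

For example, `next_check(datetime(2023, 12, 13, 9, 0))` gives 01:16 on 12th January 2024.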
I’m old enough to know better than to believe progress bars. It did say 3 hours left. 4 hours later it now says 2 hours left. I’m sure it’s not going to be 2 hours.
If you’re not a WxSim user you probably won’t read this post which explains slowness yesterday - https://discourse2.weather-watch.com/t/mcmahon-gfs-2023121218-z-missing-last-was-2023121212-z/72721/4?u=administrator
This doesn’t explain the very slow response before that, although it’s possible that the server was trying to access a part of the disk that was inconsistent and so took longer than normal to respond. Fingers crossed that everything is back to normal now.
I’ve just looked and the checking processes that were slowing things down have completed. The RAID array is now happy again - clean and in sync.
This reminds me of a similar incident in professional life many years ago. Windows used to run CHKDSK on startup, maybe it still does, and if it found problems it set about fixing them immediately. When everything was fixed Windows completed the boot sequence and finished loading everything up. That’s bad enough on a PC, but when it happens on a server with a big RAID array attached that’s shared by many people then it’s definitely not good. You can’t really interrupt CHKDSK without making things even worse, so we had to let it run to completion…over 24 hours if I remember correctly. After that we set CHKDSK to only report issues and not fix them. We also implemented a creative way to resolve any issues we found by copying the affected partitions to a spare array before failing the old array over to the copy and then reformatting the original array. It’s a good job we had disaster recovery plans in place with extra hardware to allow them to be implemented.
Forum’s been running very slow this evening. . . for the last 5 minutes, anyway.
I caught one of these myself this evening. I have no idea what caused it though. The server wasn’t loaded. The disks are fine (no checking/rebuilds). No process was showing up as hogging all the CPU. After the long wait finished everything seemed to go back to normal with no lingering effects. I’ll watch for this again but it’s going to be difficult to debug with seemingly random timings of when it appears.
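One way to pin down the random timings would be a small probe that fetches a page at regular intervals and records only the slow or failed attempts, so the timestamps can be lined up with the server logs afterwards. This is just a sketch using Python's standard library. The 10-second threshold, the 60-second interval and the function names are arbitrary choices for illustration, not anything the forum uses.

```python
import time
import urllib.request


def time_request(url, timeout=90):
    """Return seconds taken to fetch `url`, or None on a network-level failure."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
    except OSError:  # covers URLError, HTTPError and socket timeouts
        return None
    return time.monotonic() - start


def probe(url, threshold=10.0, fetch=time_request):
    """Time one request; return a log line if it was slow or failed, else None."""
    elapsed = fetch(url)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    if elapsed is None:
        return f"{stamp} FAILED {url}"
    if elapsed >= threshold:
        return f"{stamp} SLOW {elapsed:.1f}s {url}"
    return None


def watch(url, interval=60):
    """Probe `url` forever, printing only the slow or failed attempts."""
    while True:
        line = probe(url)
        if line:
            print(line, flush=True)
        time.sleep(interval)
```

Running `watch("https://discourse2.weather-watch.com/")` from a machine outside the server would show whether the stalls are visible from more than one location.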
It has been at times because I’ve been busy reloading plugins and changing settings to try to get the translation working. The slow 60+ seconds time I saw was when I know none of that was happening. At other times it will have gone non-responsive for you, or very slow as updates were being applied.
Sorry about this…hopefully it will be resolved tomorrow and we can go back to more quiet times!
Probably just me but today the forum is laggy…every click takes 15-20 seconds to get a response. Not a biggie just wanted to let you know in case you want to check the logs or something…not sure why I always seem to have issues with the forum.
Same here.
No issues here today. However I am in the UK so closer to the server!
Stuart
I’ve had a busy day but I’ve done a quick scan of the logs. Nothing stands out though. No obvious errors or excessive log entries. I don’t currently have any long-term monitoring of CPU, memory or network bandwidth, but CPU is rarely over 5%, memory usage is usually 15-20%, and I have bandwidth warnings in place which haven’t triggered.
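In the absence of proper long-term monitoring, a very rough stopgap could be a script that appends a reading to a CSV file every minute, so there is something to look back at when a lag is reported. The sketch below assumes a Linux server with `/proc/meminfo` (including the `MemAvailable` field). It records the 1-minute load average rather than true CPU percentage and ignores network bandwidth entirely. All function names and the CSV layout are made up for illustration.

```python
import csv
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_used_percent(meminfo):
    """Percentage of RAM in use, based on MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    return 100.0 * (total - meminfo["MemAvailable"]) / total


def sample():
    """Take one reading: timestamp, 1-minute load average, memory used %."""
    with open("/proc/meminfo") as f:
        meminfo = parse_meminfo(f.read())
    load1, _, _ = os.getloadavg()
    return [time.strftime("%Y-%m-%d %H:%M:%S"),
            f"{load1:.2f}",
            f"{memory_used_percent(meminfo):.1f}"]


def append_sample(path):
    """Append one reading to a CSV file, e.g. when run once a minute from cron."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(sample())
```

Calling `append_sample()` with a log file path once a minute would build up a history that can be checked against the times people report lags.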
Keep letting me know when you see these lags. Maybe something will eventually show up in the logs.
I don’t know what the connections between the UK and Finland are like…it’s possible in network terms that we could be further away than the US. I’m just waiting for the local Internet to be connected up to the new data link that comes ashore a few miles away from me. It’s apparently capable of carrying one third of all the global Internet traffic!