Troubleshooting Technology Problems
Many problems that occur on a network are easily solved by the technicians. Sometimes, however, a problem is so perplexing that other parties need to get involved. Solving such problems takes a good process as well as good problem-solving skills. Below are some perplexing problems that occurred during the recent network upgrade in my district.
Problem Description |
Steps Taken |
Final Diagnosis |
SLOW INTERNET
This February our district upgraded the network and the phone system and began piloting a new web filter. Our district normally uses its help desk for most technology issues, but our network administrator left the district just after this upgrade. During the changeover, the technology director therefore asked some of the teachers to track the problems they were having. She explained that during this type of upgrade, technology problems often go unnoticed or unreported, and she wanted her "key users" to be part of the troubleshooting process. Upgrades:
The week following the upgrade, I began to notice EXTREMELY slow internet service. I was unable to run any streaming video, even though a portion of our network was dedicated to this purpose. Sites that were graphics-intensive or included sound either returned an error or would not load.
|
Because my building lies on the outer edge of the network, the first step was to assess the problem from the top down: is this a real problem (or only perceived by the user), is it an individual problem (one user's machine only), is it a building problem (isolated to this peripheral building), or is it a network issue (happening over the whole network)?
Once it was determined that this was a widespread problem, the next step was to look at each recent change to isolate what might be causing it.
Wide Area Network Upgrade: Within the WAN, the technicians ran reports to check whether bandwidth was running at full speed.
Internet Service Provider Upgrade: Because it was confirmed that the WAN was not running at full speed, a call was made to Comcast (the internet service provider) to find out from their end whether bandwidth was being limited.
After the ISP was tested, it was determined that the error must be within our network. Most of the district was still running Bess as a content filter, and that tested as OK. Packet shaping tests determined that internet traffic was prioritized for 10 meg. The next step was to examine the PIX firewall, because PIX configuration changes had been made during the changeover and the computer technicians wanted to confirm that nothing had been improperly changed. No problems were found in this area. The technicians concluded that they would need to isolate the different software packages by pulling things "offline" until the network was back up to speed.
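The "pull things offline" step can be sketched as a simple loop. This is only an illustration under stated assumptions, not the technicians' actual procedure: it assumes components are disabled one at a time, the function and parameter names are hypothetical, and the throughput figures in the usage example are made up.

```python
from typing import Callable, Iterable, Optional


def find_bottleneck(
    components: Iterable[str],
    measure_mbps_without: Callable[[str], float],
    expected_mbps: float,
    tolerance: float = 0.9,
) -> Optional[str]:
    """Return the first component whose removal restores throughput, else None.

    measure_mbps_without(name) is a hypothetical callback that takes the named
    component offline, measures throughput, and brings the component back.
    """
    for name in components:
        # Treat throughput within `tolerance` of the expected rate as restored.
        if measure_mbps_without(name) >= expected_mbps * tolerance:
            return name
    return None


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    simulated = {"content filter": 1.2, "firewall": 1.1, "packet shaper": 9.8}
    print(find_bottleneck(simulated, simulated.__getitem__, expected_mbps=10.0))
```

Testing one component at a time keeps each measurement attributable to a single change, at the cost of a brief outage for each service tested.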
|
It was determined that the upgrade of the packet shaping software had created a bottleneck in internet traffic. The packet shaper was upgraded on the same day as the network, and something went wrong during that upgrade. The upgrade was reapplied, and the network began to run at full speed. The one remaining problem affecting internet speed was that the district server was still not caching web sites. Caching was planned as part of the Blue Coat implementation (see error # 2 below). |
UNITED STREAMING:
User Authentication Error
For many years, our district used Bess web filtering. We were concerned about the number of sites that were restricted because of their category. The technology committee decided to institute a new “on-proxy” web filtering solution called Blue Coat, which would allow for greater customization of our internet filtering policy. Since I had been involved in the slow internet process, the technicians asked me to be a pilot user and identify any conflicts between Blue Coat and common internet applications before the rollout to the district. The first thing I discovered was that the internet worked MUCH more quickly, because the proxy server had begun to cache websites for the district. However, whenever I attempted to log on to United Streaming, I would get the initial screen, but if I clicked on any links, I would get "bounced back" and asked to log in again. I couldn't use the program the way it was intended.
|
Once I reported the error to the help desk, they asked me to try logging in to other web-based applications and to report any that caused the same or similar problems. The two applications that I had difficulty with were United Streaming and KidBiz3000. Once the technicians had identified the problem on my machine, they once again set out to determine whether it was isolated to my machine or widespread throughout the building or district. There were two separate variables to test: individual computer location and individual user log-on.
The troubleshooting was not a simple process. After consulting with Blue Coat support, the technicians determined that the problem was an authentication error in each of the applications. The tech department began by isolating the United Streaming application, because United Streaming has good customer service and is widely used throughout the district. It was essential to get this problem resolved before rolling out the new Blue Coat web filter to the district. We were sure that if we could determine how United Streaming authenticated users, we would know how to proceed with other applications. The first contact was tech support at United Streaming. It turned out that United Streaming authenticates users through a different portal, one that the district proxy server was set to block. The computer technician then consulted with Blue Coat and was able to resolve the problem by unblocking the site used to authenticate users. The same process needed to be followed for any site that authenticated users in a similar way.
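The check described above, finding out whether a site's login passes through a host that the proxy blocks, can be sketched in a few lines. This is a minimal illustration rather than anything Blue Coat provides: the function name is hypothetical, and the URLs and block list in the usage example are placeholders, not the real United Streaming addresses.

```python
from urllib.parse import urlsplit


def blocked_auth_hosts(urls, blocked_hosts):
    """Return the hosts from `urls`, in first-seen order, that are on the block list."""
    blocked = {host.lower() for host in blocked_hosts}
    found = []
    for url in urls:
        # hostname is None for URLs without a network location.
        host = (urlsplit(url).hostname or "").lower()
        if host in blocked and host not in found:
            found.append(host)
    return found


if __name__ == "__main__":
    # Placeholder URLs observed during a login attempt (hypothetical).
    login_urls = [
        "https://video.example.com/home",
        "https://auth.example.net/login?next=/home",
    ]
    print(blocked_auth_hosts(login_urls, ["auth.example.net"]))
```

Any host this returns is a candidate for unblocking, which matches the fix that was applied for the authentication portal.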
|
Once the portal was unblocked in Blue Coat on the proxy server, United Streaming was no longer a problem. This was NOT JUST A United Streaming problem, however: other portal sites (SchoolWires, KidBiz3000) handle user authentication in a similar fashion. A plan therefore needed to be developed so that end users would be aware of the problem and know how to report it, so that it could be easily solved after Blue Coat was applied to the entire district.
FOR FUTURE:
Once Blue Coat is rolled out in the district, users will be advised through their Technology Committee reps to report authentication errors to the help desk immediately. The problem will also be listed in the Knowledge Base of the new help desk.
|
For quite some time, e-mail (especially over the weekend) seemed to be delayed. This problem became increasingly noticeable after our network administrator left. One weekend, I was working with a colleague who was sending e-mail to my district account that never got delivered. On a different occasion, I noticed that several e-mails from parents and work from students that had been sent over the weekend didn't arrive until well into the school day on Monday. Since this problem was occurring more regularly, I reported it to the technology department by phone after 3 days had passed with no e-mail. |
This was an example of a situation that the technology director didn't know about because it went unreported. The former network administrator was aware of the problem, but since other issues had higher priority, he did not attempt to troubleshoot it. Because it was unreported, it never came to the attention of the other network technicians or the technology director. He managed the e-mail by forcing the queue and resetting the server whenever incoming mail was sluggish. Once the network administrator had left, the problem became more prevalent because no one was there to monitor it. There were actually 3 days when no e-mail was delivered. The problem was identified by examining the incoming and outgoing mail servers. The incoming (Pluto) and outgoing (Mickey) servers are separate servers. Both were being used for other applications, and there was not much memory on either one. If the queue grew large very quickly, as it often does with spam, the e-mail service would exceed the available memory and fail. When it failed, it held all of the messages in the queue so that no mail would be dropped. To deliver the queued messages, the technicians forced the queue through and reset the server as had been done in the past, but now that the problem had been identified, they came up with 2 short-term solutions and 1 long-term solution. |
This solution needs to be examined in terms of short-term and long-term "fixes."
SHORT TERM
Monitor the server: A technician is assigned to monitor the e-mail server every night and every morning.
Add memory: Once memory is added, the problem should lessen; however, monitoring will still be necessary to plan for the long-term fix.
LONG TERM
Reallocating services: There are currently too many services on the e-mail server. Reallocating them, however, will require a major "clean up" of the servers. The director plans to wait until a new network administrator is in place before initiating this solution.
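The nightly and morning monitoring described under SHORT TERM could be partly automated with a threshold check like the one below. This is a hedged sketch only: the `MailServerStatus` fields, the function name, and the default thresholds are all assumptions for illustration, and how the queue length and free memory would be read from Pluto or Mickey is deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MailServerStatus:
    """Snapshot of one mail server (hypothetical fields)."""
    queued_messages: int
    free_memory_mb: float


def check_mail_server(
    status: MailServerStatus,
    max_queue: int = 500,        # assumed threshold, not a measured value
    min_free_mb: float = 256.0,  # assumed threshold, not a measured value
) -> List[str]:
    """Return a list of warnings; an empty list means nothing needs attention."""
    warnings = []
    if status.queued_messages > max_queue:
        warnings.append(
            f"queue has {status.queued_messages} messages (limit {max_queue})"
        )
    if status.free_memory_mb < min_free_mb:
        warnings.append(
            f"only {status.free_memory_mb:.0f} MB free (minimum {min_free_mb:.0f} MB)"
        )
    return warnings


if __name__ == "__main__":
    # Made-up snapshot, for illustration only.
    for message in check_mail_server(MailServerStatus(5000, 64.0)):
        print("WARNING:", message)
```

Watching both the queue and the free memory reflects the diagnosis: the service failed when a fast-growing queue exhausted memory, so either signal alone could give an early warning.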
|