The recent growth of the Internet has far exceeded the expectations of experts, network providers, and users the world over. Companies that two years ago had not heard of a web browser now have departments for establishing and managing the company's web presence and/or intranets. The current Internet infrastructure bears no resemblance to the modest Internet that existed even as late as 1992. Where is the net heading? Will it crumble under its own weight? The pitfalls that lie ahead for companies looking to get an early start on this entirely new medium are many and varied. Fortunately, the rewards are even greater. Without question, one of the biggest issues for infopreneurs is bandwidth availability and savvy network management; in essence, speed. With literally tens of thousands of websites going online daily, a company should consider speed and availability to be primary goals for a successful net presence. Most sites will get only one shot at attracting a user, and a slow website means almost certain death on the net. No amazing content or graphics can compensate for a long wait on the other end of the line.
What makes a site slow or fast?
Slow access to your site can be caused by any of six major problem areas: the routing, the MAEs, the webserver hardware or setup, the ISP's bandwidth, the local area networks, and the graphics and CGI that make up a site. I will explain each of them in light detail. It is important to note that many of these issues are interdependent. Even if you have eliminated five of the problem areas, it is still very possible, in fact likely, that you will have a very slow site if even one area is at issue. You truly need to address all of these areas in order to have a successful site.
Routing and Peering
What is routing? Why is it important and how does it affect the speed of my web site?
IP routing is to the Internet what voodoo is to religion: a dark science that few understand, and one that can be very dangerous if misused. It is an amazing feat that we don't have more network outages daily due to routing. Recently a small Internet provider brought about 40% of the net to a halt when it fed its upstream provider (Sprint) faulty routing information, which effectively routed ALL Internet traffic through its router. It wasn't even that hard to do. Simply put, you want to host your Internet services with a provider that has good, solid routing. Routing is not something that people should experiment with, nor should it change often. Traceroute is a powerful tool for the novice who wants to see how their traffic is routed. Below are traceroutes to four backbone providers, with simple explanations. They are all being done from the same workstation, in this case a Pentium 200 running a flavor of BSD Unix. It is obvious that the provider we are using for these traceroutes has at least four backbone providers, as all of the traceroutes start on a single core router but each goes out a different path (see line 2 in each traceroute). The times in ms shown after the IP addresses are round-trip times from the start-host to that hop's host; each is measured from the start-host, not accumulated across the preceding hops.
It is often possible to see what kind of connection is in place by looking at the router names. For example, line 2 in the first traceroute says "ds3"; this most likely indicates a DS3, i.e. a T3 or 45mb connection. "DS1" normally means a T1 or 1.54mb connection, and "HSSI" (High Speed Serial Interface) normally indicates a DS3 (45mb) connection. If you see "ATM" it most likely indicates a 155mb connection, as does "OC3"; both of these are used only by the major providers.
bmw> traceroute www.agis.net
traceroute to webserver.agis.net (184.108.40.206), 30 hops max, 40 byte packets
1 wdc-5-eth0.wdc.dn.net (220.127.116.11) 20 ms 0 ms 10 ms
2 agis-ds3-dn.washington2.agis.net (18.104.22.168) 10 ms 10 ms 0 ms
3 ga02e.washington4.agis.net (22.214.171.124) 20 ms 10 ms 10 ms
4 ga007.chicago3.agis.net (126.96.36.199) 40 ms 20 ms 30 ms
5 a0.1010.dearborn2.agis.net (188.8.131.52) 30 ms 30 ms 40 ms
6 webserver.agis.net (184.108.40.206) 30 ms 30 ms 40 ms
bmw> traceroute www.cais.net
traceroute to www.cais.net (220.127.116.11), 30 hops max, 40 byte packets
1 wdc-5-eth0.wdc.dn.net (18.104.22.168) 20 ms 0 ms 0 ms
2 100mb-Lst.cais.net (22.214.171.124) 20 ms 10 ms 10 ms
3 Lst-to-McLean-DS3.cais.net (126.96.36.199) 10 ms 0 ms 10 ms
4 www.cais.com (188.8.131.52) 10 ms 0 ms 10 ms
bmw> traceroute www.mci.com
traceroute to www.mci.com (184.108.40.206), 30 hops max, 40 byte packets
1 wdc-5-eth0.wdc.dn.net (220.127.116.11) 10 ms 0 ms 10 ms
2 mae-east-plusplus.washington.mci.net (18.104.22.168) 10 ms 10 ms 10 ms
3 core2-hssi2-0.Washington.mci.net (22.214.171.124) 10 ms 20 ms
4 core2.NorthRoyalton.mci.net (126.96.36.199) 40 ms 50 ms 60 ms
5 border8-fddi-0.NorthRoyalton.mci.net (188.8.131.52) 50 ms 60 ms 50 ms
6 inp-mci.NorthRoyalton.mci.net (184.108.40.206) 60 ms * *
bmw> traceroute to www.uu.net (220.127.116.11), 30 hops max, 40 byte packets
1 wdc-5-eth0.wdc.dn.net (18.104.22.168) 10 ms 0 ms 0 ms
2 659.Hssi4-0.GW1.TCO1.ALTER.NET (22.214.171.124) 20 ms 10 ms 0 ms
3 421.atm10-0.cr2.tco1.alter.net (126.96.36.199) 0 ms 30 ms 10 ms
4 312.atm3-0.br2.tco1.alter.net (188.8.131.52) 10 ms 10 ms 10 ms
5 Hssi1-0.CR2.DCA1.Alter.Net (184.108.40.206) 20 ms 10 ms 30 ms
6 Hssi2-0.GW2.FFX1.Alter.Net (220.127.116.11) 250 ms 260 ms 270 ms
7 UUNET7-GW.UU.NET (18.104.22.168) 20 ms 10 ms 30 ms
8 charlotte01.va.pubnix.com (22.214.171.124) 20 ms 20 ms 20 ms
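Hop lines like the ones above can also be pulled apart programmatically. Here is a minimal sketch in Python (the regular expression and the sample line are my own illustration, not part of the traceroute tool itself) that extracts the hop number, hostname, IP address, and probe times from one line of traceroute output:

```python
import re

# Matches one traceroute hop line, e.g.:
#   " 2  agis-ds3-dn.washington2.agis.net (18.104.22.168)  10 ms  10 ms  0 ms"
HOP_RE = re.compile(
    r"^\s*(?P<hop>\d+)\s+(?P<host>\S+)\s+\((?P<ip>[\d.]+)\)\s+(?P<times>.*)$"
)

def parse_hop(line):
    """Return (hop number, hostname, IP, list of RTTs in ms), or None."""
    m = HOP_RE.match(line)
    if not m:
        return None
    # Each probe time looks like "10 ms"; a "*" (timed-out probe) is skipped.
    rtts = [float(t) for t in re.findall(r"([\d.]+)\s*ms", m.group("times"))]
    return int(m.group("hop")), m.group("host"), m.group("ip"), rtts

line = " 2  agis-ds3-dn.washington2.agis.net (18.104.22.168)  10 ms  10 ms  0 ms"
hop, host, ip, rtts = parse_hop(line)
print(hop, host, rtts)  # hop 2 with its three probe times
```

Feeding each line of a saved traceroute through a parser like this makes it easy to spot the hop where latency jumps, such as line 6 of the uu.net trace above.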
Ask your provider if they run Border Gateway Protocol (BGP4) and have a correctly registered Autonomous System Number (ASN). BGP is used by a provider that has more than one access point to the net. If BGP and the ASN are set up correctly, the ISP can offer real redundancy to your Internet servers. However, if the provider has two connections to the same upstream provider, the correct level of redundancy does not exist: if the upstream provider is off the net, so are both of the downstream provider's connections. BGP, when correctly implemented, allows providers to manage paths to specific IP networks. The ASN is what announces to the world which networks, or CIDR blocks, an ISP handles. In the ideal situation, a leased-line failure should cause the BGP session on the failed router to close and the other router, on the working circuit, to start accepting traffic for the downed networks. When correctly set up, this change requires no human intervention and can happen without noticeable network downtime.
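As an illustration only (the AS numbers, addresses, and network block below are invented for the example; real configurations carry many more policy statements), a multi-homed provider's BGP4 setup in Cisco IOS-style syntax might look roughly like this:

```
router bgp 64512                        ! the provider's own (example) ASN
 network 192.0.2.0 mask 255.255.255.0   ! announce the provider's CIDR block
 neighbor 198.51.100.1 remote-as 64513  ! session to the first upstream backbone
 neighbor 203.0.113.1 remote-as 64514   ! session to a second, separate upstream
```

If the leased line toward 198.51.100.1 fails, that BGP session drops and the announcement for 192.0.2.0/24 is withdrawn on that path, so traffic shifts to the second session with no human intervention.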
Few people really understand bandwidth, and some ISPs perpetrate huge misrepresentations with regard to "their bandwidth," some due to ego and others due to a lack of knowledge or an acceptance of myths. To set the record straight, it is very possible that access to a website housed at an ISP with a single T1 (1.54mb) will be significantly faster than the same website located at a provider with a T3 (45mb). How can this be? Bandwidth is simply a pipe to the Internet. What we are looking for is "available bandwidth," a number that few providers know, and even fewer are willing to give out. For example, if one Internet provider with a T3 is operating at or near full capacity and another provider with a single T1 is at 5% capacity, and all else is the same, it is most likely that the provider with the T1 will be significantly faster. There are many other issues that can affect the results of such a test; they are addressed in detail below. Don't confuse this with an endorsement of smaller ISPs with T1s; that is far from the case. It is merely an example of what can happen, and why the number we are looking for is available bandwidth, or equivalently bandwidth utilization. I would strongly advise against hosting your services with a firm that is not connected to at least two major backbone providers. Providers that are multi-homed and correctly set up can actually be more reliable than a single backbone provider, as they have multiple paths over separate networks.
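The available-bandwidth point is simple arithmetic. A short sketch, using the hypothetical utilization figures from the example above (95% is my assumed reading of "at or near full capacity"):

```python
def available_bandwidth(capacity_mbps, utilization):
    """Headroom left on a pipe: capacity minus the portion already in use."""
    return capacity_mbps * (1.0 - utilization)

# A T3 (45mb) running at 95% of capacity...
t3_free = available_bandwidth(45.0, 0.95)
# ...versus a T1 (1.54mb) running at only 5% of capacity.
t1_free = available_bandwidth(1.54, 0.05)

print(round(t3_free, 2))  # 2.25 mb of headroom on the "bigger" pipe
print(round(t1_free, 2))  # 1.46 mb of headroom on the T1
```

The saturated T3 has barely more free capacity than the idle T1, and a saturated link also suffers queuing delay and packet loss, which is why the "smaller" provider can feel faster.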
Another important issue to keep in mind is the geographical location of the provider. The major Network Access Points, or NAPs, are located in Washington, DC; San Jose, CA; Chicago, IL; and New York (technically New Jersey). A year ago I would have said it was critical for any major provider to be connected at the NAPs. As of early 1997 I have started to see the NAPs as somewhat less important. MAE East, which according to some estimates carries at least 30% of ALL Internet traffic, has had more problems in the last six months than in the previous two years. MFS, which manages the MAEs, has little control over how much traffic the connected backbone providers will carry. There have been major outages and consistent packet loss that can and do affect all major providers. When a provider has multiple connections to major backbone providers, it can push its traffic off to the backbones of the upstream providers instead of through the MAEs.
Local Area Network
The local area network is too seldom recognized as the major source of latency that it often is. If a user has a full-time connection to the Internet, latency can arise at at least two points: the user's local area network and the Internet provider's local area network. Further "LAN" speed issues have recently been cropping up at the MAEs, where the Internet providers meet to exchange network traffic. Most of the MAEs utilize DEC Gigaswitches to manage the traffic between providers. These switches, even though they are some of the best technology out there, have consistently been overloaded in the past six months. Most of the problems, however, will occur on either the provider's local area network or the client's local area network. As providers add dozens or even hundreds of computers to their networks, they need to address local traffic with the same concern with which they manage their Internet routers.
The Webserver hardware and set-up
One of the single biggest points of failure for a website is the actual server hardware. There are many reasons a server can fail or become slow. The number one reason is that the server is restarted in order for necessary changes to take effect. This is most common on a server that has only recently been set up, or one that supports a large number of clients. Bad or inefficient CGI and heavy database usage can also bring a fast server to its knees in no time. Therefore, any company looking to have a serious Internet presence should get a dedicated server, not a shared server. A dedicated server will ensure that if the server is busy, it is busy because your clients or users are accessing it, not because there are 1,000 kids worldwide trying to download the latest patch to NBA Basketball97 - at your firm's expense, I might add. In today's market of web presence providers, it is all too easy to get caught up in a cheap solution that will be slow or unavailable to your users. As you have only a limited window to attract users to your site, a slow or unavailable site is the kiss of death, and perhaps the waste of thousands of dollars in development and marketing.
Another common pitfall is that providers greatly oversell a server's capacity or their own Internet bandwidth. It is nearly impossible for a provider to know which clients are going to get a lot of traffic, and equally difficult to determine what is too much load on a server. The lower end of the web hosting market commonly supports 300+ clients on a single server; all it would take is the addition of a couple of heavily trafficked sites to slow such a server to a crawl. Internet providers that host adult-oriented sites should be avoided, as these sites are bandwidth and server intensive, not to mention heavily trafficked.
Choosing the correct server hardware is also critical. I have had great luck with Sun Microsystems hardware and Sun operating systems. Our firm also supports Microsoft's NT, Apple's MacOS, Linux, and BSD; the latter two run on normal PC hardware. We have not found any of these other offerings to be as reliable or robust as Sun's. Sun's hardware is built to support an operating system that lives for TCP/IP, and Sun's implementation of TCP/IP is by far the most robust of any commercial operating system. In addition, all current Sun hardware uses SCSI (Small Computer System Interface) to control storage devices. SCSI is faster and more expandable than the IDE drives normally found on PC hardware, and drive throughput becomes a critical issue on Internet servers operating under anything other than minimal load. Apple Macintosh computers, which have clearly dominated the graphics industry for years, have high performance CPUs and SCSI drives, but an extremely limited TCP/IP implementation. Mac servers can currently support only a single IP address without the addition of add-on products to redirect traffic. Apple has a way to go before it should be considered a serious choice for a high-volume Internet server.
Graphics and CGI
As a consultant to hundreds of companies planning their Internet presence, I often suggest that they not attempt to set up a site unless they are budgeting the dollars to correctly portray their company's image in a manner consistent with all of their other marketing and sales collateral. I recently worked with a regional bank that spends hundreds of thousands of dollars on design and corporate image. Yet their first website was designed in a matter of days, with little knowledge of the real goals or technology needed, and with even less concern for the corporate identity the bank had spent nearly 100 years building. The site that went up was rushed, slow, ugly, and unprofessional. After a short period of time this became evident, and a small committee was put together to devise a better site. During the redesign I suggested that a single, well-designed, web-friendly page be put up, announcing the upcoming "grand opening." The firm's site is now a fine example of banking on the net.
This scenario is being played out time and time again as the corporate world gets the hang of the Internet. Too many sites going online daily are either not optimized for the web or do not clearly show a user how to navigate quickly to the information needed. I am an advocate of fast sites that are light on graphics, that clearly mark a user's path to the information they want, and that make it easy to purchase products online. When designing a website, remember to view the site in the same manner as a user. If 95% of your users will be accessing the site with a 28,800 bps modem, then you should do the same, at a minimum several times during the site's development. If you are designing an intranet that will be used only by a small workgroup on a local area network, then you can feel comfortable viewing your site over your network.
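A quick back-of-the-envelope calculation shows why heavy graphics hurt at modem speeds. This sketch is idealized: it ignores protocol overhead and modem compression, and the page sizes are examples of my own choosing, not figures from any particular site:

```python
def transfer_seconds(page_kbytes, line_bps=28_800):
    """Idealized transfer time: page size in bytes times 8 bits, over line rate."""
    return page_kbytes * 1024 * 8 / line_bps

# A lean 30 KB page versus a graphics-heavy 200 KB page over a 28,800 bps modem:
print(round(transfer_seconds(30), 1))   # about 8.5 seconds
print(round(transfer_seconds(200), 1))  # about 56.9 seconds
```

Nearly a minute of waiting is exactly the "long wait on the other end of the line" that no content can compensate for, which is why viewing the site over the same modem your users have is so valuable.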
I realize that I have only touched on a few of the areas that should concern firms looking to establish and maintain a fast, reliable Internet presence. Future white papers will address some of these issues in much greater detail.
Questions or Feedback?