What Is a Server and Why Does a Server Crash?

FREE-SKY (HK) ELECTRONICS CO.,LIMITED / 10-21 09:35

The server provides computing or application services for other clients (such as PCs, smartphones, ATMs, and other terminals or even large equipment such as train systems) in the network.

What is a server?

A server is a type of computer. Compared with an ordinary computer, it runs faster, carries a heavier load, and costs more. It provides computing or application services to other clients on the network, such as PCs, smartphones, ATMs, and other terminals, or even large equipment such as train systems. A server offers high-speed CPU computing, long-term reliable operation, high I/O throughput, and better scalability.

Judged by the services it provides, a server must be able to respond to service requests, carry out those services, and guarantee them. As an electronic device, a server has a complicated internal structure, but that structure is not much different from an ordinary computer's: CPU, hard disk, memory, operating system, system bus, and so on.

Server VS. PC

A personal computer (PC) serves one person, so a PC problem has a limited scope of influence. A computer that serves many people is called a server, and when a server has a problem, far more people are affected.

Unlike PCs, which serve individuals, servers provide high-performance services to many users at once, in the largest deployments to hundreds of millions of people. Servers therefore have relatively high requirements for processing capability, management capability, I/O performance, availability, reliability, and scalability.

A server is composed of many components, including CPU, memory, hard disk, fan, power supply, and so on. Although the server and the PC adopt a similar architecture, each server component is built to a higher specification and a more professional grade.

The table below shows the gap between a server and a PC.

Components	PC	Server (take R5300G4X as an example)
Chassis	Different shapes, generally vertical	Standard flat structure whose height is measured in rack units (1U is approximately 44 mm), such as a 2U server. A rack can contain multiple servers.
Motherboard	Ordinary household design	Professional design
CPU	1	Supports 2 third-generation Intel® Xeon® Scalable processors (Ice Lake), up to 40 cores per processor
Memory	16 GB, 32 GB	Supports up to 32 DDR4 memory modules at speeds up to 3200 MT/s
Hard Disk	1 or 2 mechanical hard disks or SSDs	Up to 41 2.5" disk bays, or 20 3.5" disk bays plus 4 2.5" disk bays; supports up to 28 NVMe SSDs
I/O module	-	Supports up to 14 PCIe 4.0 expansion slots
Power supply	1	1+1 hot-swappable redundant power supply
Fan	1 or 2	4 groups of high-efficiency fans, N+1 redundancy, intelligently adjusted cooling system
Network card	100M or 1000M	2 GE electrical ports (1 GE = 1000M)
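
The rack-unit arithmetic in the chassis row can be made concrete. This is a minimal sketch, assuming the standard rack unit of 1.75 inches (44.45 mm, which the table rounds to about 44 mm) and a hypothetical 42U rack. The function names are illustrative and do not come from any vendor tool.

```python
RACK_UNIT_MM = 44.45  # 1U = 1.75 in; the table rounds this to about 44 mm


def chassis_height_mm(units: int) -> float:
    """Height of a chassis that occupies `units` rack units."""
    return units * RACK_UNIT_MM


def servers_per_rack(rack_units: int, server_units: int) -> int:
    """How many servers of a given height fit, ignoring switches and power strips."""
    return rack_units // server_units


if __name__ == "__main__":
    print(f"2U chassis height: {chassis_height_mm(2):.1f} mm")
    # 42U is an assumed rack size, used only for illustration.
    print(f"2U servers in a 42U rack: {servers_per_rack(42, 2)}")
```

In practice a rack also holds network switches and power distribution, so the real number of servers per rack is lower than this upper bound.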

Servers are generally deployed in large companies and organizations, such as major operators, financial companies, gaming companies, and Internet companies. A data center composed of hundreds of servers can store massive amounts of data and provide network services, financial services, or data services to hundreds of millions of people.

What happens if the server crashes?

Once a server crashes, the services running on it may fail. And because a server serves a large number of people, the impact is wider and the consequences are more serious.

The impact of a server crash.

Video website: The site becomes inaccessible on a large scale and online video cannot be watched. If data on the server is lost and the original videos and animations of many authors cannot be restored, it is a disaster.

Financial system: A financial system that processes a very large volume of transactions every second requires rock-solid servers. A crash affects everyone's money transfers, and the loss is incalculable.

Competitive games: A popular competitive game may have tens of millions of players online. What is it like when tens of millions of people are disconnected at the same time? There is no good answer.

Why does the server crash?

Some reasons why the server cannot continue to provide services may be:

The number of users grows fast, and server performance cannot grow at the same pace

The impact of user growth on servers.

A data center that contains hundreds of servers can provide services to a large number of users. But when the number of users grows too fast, even that many servers reach their performance limit. Capacity cannot be added overnight, so supply falls short of demand. Building a brand-new data center by traditional methods takes a lot of time.
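
The gap between user growth and capacity can be put into numbers. This is a rough sketch in which every figure (requests per user, per-server throughput, spare headroom) is hypothetical and chosen only for illustration.

```python
import math


def servers_needed(users: int, requests_per_user_per_s: float,
                   server_capacity_rps: float, headroom: float = 0.3) -> int:
    """Servers required to carry the load while keeping `headroom` spare capacity."""
    total_rps = users * requests_per_user_per_s
    usable_rps = server_capacity_rps * (1.0 - headroom)
    return math.ceil(total_rps / usable_rps)


if __name__ == "__main__":
    # Assumed values: 0.05 requests/s per user, 2000 requests/s per server.
    for users in (1_000_000, 2_000_000, 4_000_000):
        print(f"{users:>9} users -> {servers_needed(users, 0.05, 2_000)} servers")
```

Each doubling of users roughly doubles the number of servers required, while buying, racking, and configuring hardware takes far longer than the growth itself.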

Exhaustion of resources (CPU, memory, hard disk) caused by a large number of service requests

CPU, memory, and hard disk capacity are exhausted.

In the Internet age, hundreds of millions of service requests are submitted to servers for processing, forcing them to run at full capacity: CPUs run hot, memory fills with data, and hard disk usage approaches 100%.
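
A simple check for this kind of exhaustion might look like the sketch below. It uses only the Python standard library, so it samples disk usage and, where the platform offers it, the 1-minute load average per CPU core as a rough proxy for CPU pressure. Memory is left out because the standard library has no portable call for it. The 90% limit and both function names are assumptions made for illustration.

```python
import os
import shutil


def exhausted(usage: dict[str, float], limit: float = 0.9) -> list[str]:
    """Return the names of resources whose utilization (0..1) is at or above `limit`."""
    return sorted(name for name, used in usage.items() if used >= limit)


def sample_usage(path: str = "/") -> dict[str, float]:
    """Sample disk and CPU pressure on the local machine."""
    disk = shutil.disk_usage(path)
    usage = {"disk": disk.used / disk.total}
    if hasattr(os, "getloadavg"):  # not available on Windows
        try:
            load_1min = os.getloadavg()[0]
            usage["cpu"] = load_1min / (os.cpu_count() or 1)
        except OSError:
            pass  # load average could not be obtained on this system
    return usage


if __name__ == "__main__":
    current = sample_usage()
    print("usage:", current)
    print("at or above limit:", exhausted(current))
```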

Mass visits in a short period of time have an impact on the performance of the server

Mass visits in a short period of time.

Internet-specific phenomena such as major social events and unexpected hot topics cause a large number of users to flood into a website or app in a short period of time, hitting the processing capacity of the servers in waves like a sudden tsunami. No matter how large a data center is, a big enough surge can overwhelm its servers and make websites and apps temporarily inaccessible.
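
The effect of such a surge can be illustrated with a toy model: requests arrive every second, the servers process a fixed number per second, and whatever is left over waits in a backlog. The traffic figures and the capacity of 1000 requests per second are invented for this sketch.

```python
def simulate_backlog(arrivals_per_s: list[int], capacity_per_s: int) -> list[int]:
    """Requests still waiting at the end of each second."""
    backlog = 0
    history = []
    for arrived in arrivals_per_s:
        # Serve up to capacity; anything beyond it stays queued.
        backlog = max(0, backlog + arrived - capacity_per_s)
        history.append(backlog)
    return history


if __name__ == "__main__":
    # Steady traffic, a three-second spike, then steady traffic again.
    traffic = [800, 800, 5000, 5000, 5000, 800, 800]
    print(simulate_backlog(traffic, capacity_per_s=1000))
    # prints [0, 0, 4000, 8000, 12000, 11800, 11600]
```

The backlog builds by 4000 requests per second during the spike but drains by only 200 per second afterwards, which is why a site can stay slow or unreachable long after the peak has passed.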

Unexpected reasons

Schrodinger's reason.

A server crash may also be caused by something no one anticipated: an extra line of code, optical cables cut during road construction, DDoS attacks from unknown sources, Trojan horses planted by hackers, or internal server errors (5XX status codes). Just like Schrodinger's cat, no one can confirm which cause made the trouble until the moment the box is opened.
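
For reference, the 5XX status codes mentioned above can be listed with Python's standard `http.HTTPStatus` enumeration. This is a small sketch; the exact set printed depends on the Python version, and the helper name `is_server_error` is chosen here for illustration.

```python
from http import HTTPStatus

# All status codes in the 500-599 range known to this Python version.
SERVER_ERRORS = {s.value: s.phrase for s in HTTPStatus if 500 <= s.value <= 599}


def is_server_error(code: int) -> bool:
    """True when the status code signals a failure on the server side."""
    return 500 <= code <= 599


if __name__ == "__main__":
    for code, phrase in sorted(SERVER_ERRORS.items()):
        print(code, phrase)
```

By contrast, a 404 is a 4XX code: it reports a problem with the request, not a failure inside the server.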

Characteristics of servers

We can measure whether a server achieves its design purpose in five aspects: R: Reliability; A: Availability; S: Scalability; U: Usability; M: Manageability. Together these form the RASUM metric of the server.

1. Scalability

A server must have a certain degree of "scalability". This is because a corporate network does not stay the same forever, especially in today's information age. If a server cannot be expanded, it will be unable to cope when the number of users increases. A server worth tens of thousands or even hundreds of thousands would then be retired after a short period of time, which is unbearable for any enterprise. To maintain scalability, a server usually needs a certain amount of expansion space and redundancy (such as disk array bays, PCI slots, and memory slots).

Scalability is specifically embodied in whether the hard disks can be expanded, whether the CPU can be upgraded or added to, and whether the system supports multiple mainstream operating systems such as Windows NT, Linux, or UNIX. Only in this way can the initial investment be fully utilized later on.

2. Usability

A server's functions are much more complicated than a PC's, in both its hardware configuration and its software configuration. A server cannot deliver so many functions without comprehensive software support. However, if there is too much software, server performance may degrade and administrators cannot operate it effectively. Therefore, when designing servers, manufacturers must fully consider not only availability and stability but also ease of use.

The usability of a server is mainly reflected in whether it is easy to operate, whether the user guidance system is complete, whether the chassis design is user-friendly, whether there is a one-key recovery function, whether there is an operating system backup, and whether there is sufficient training support.

3. Availability

A very important aspect of a server is its "availability", that is, the selected server can work stably for a long time without frequent problems.

This is because a server faces the users of the entire network rather than a single user. In large and medium-sized enterprises, servers are usually required to run without interruption. In some special application areas, servers must keep working even when no one is using them, because they must continuously provide connection services whether it is during or after working hours, and whether it is a working day, a break, or a holiday. This is the fundamental reason why a server must be extremely stable.

Generally speaking, dedicated servers must work without interruption 24 hours a day, 7 days a week, especially large-scale network servers such as those used by large companies and web servers that provide public services. Such a server may be started for real work only once: when it is put into official use after being purchased, installed, and configured. After that, it runs without interruption until it is scrapped. If something goes wrong at every turn, the network cannot operate normally for long. To ensure high availability, in addition to demanding quality in every component, technical and configuration measures such as hardware redundancy and online diagnosis can also be taken.
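
The value of hardware redundancy can be estimated with basic probability. The sketch below assumes component failures are independent and that any one working copy is enough, which is a simplification. The availability figures are examples, not measurements of any real product.

```python
HOURS_PER_YEAR = 24 * 365


def downtime_hours_per_year(availability: float) -> float:
    """Expected yearly downtime for a given availability (0..1)."""
    return (1.0 - availability) * HOURS_PER_YEAR


def redundant_availability(single: float, copies: int) -> float:
    """Availability of `copies` independent components where any one is enough."""
    return 1.0 - (1.0 - single) ** copies


if __name__ == "__main__":
    # Assumed example: one power supply is available 99% of the time.
    one = 0.99
    two = redundant_availability(one, 2)  # a 1+1 redundant pair
    print(f"single: {downtime_hours_per_year(one):.1f} h/year down")
    print(f"1+1:    {downtime_hours_per_year(two):.2f} h/year down")
```

Under these assumptions a second, redundant component cuts the expected downtime by a factor of one hundred, which is why the redundant power supplies and fans in the table above matter.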

4. Manageability

Another important feature of a server is its "manageability". Although a server is expected to work continuously without interruption, even the best product can fail. Beyond being stable, a server should therefore have measures to prevent errors, detect problems promptly, and allow timely maintenance when it fails. This not only reduces the chance of server errors but also greatly improves the efficiency of server maintenance. This is essentially the serviceability proposed by Sun.

The manageability of a server is also reflected in whether it has an intelligent management system, whether it has an automatic alarm function, whether it has an independent and systematic management system, whether it has an LCD monitor, and so on. Only then can administrators manage the server easily and work efficiently.