Datacenter Structure
Introduction
The datacenter lets me host my own servers at an affordable cost in the long run. These servers include high-performance computing nodes, high-capacity storage servers, a cold data storage system (tape), general-purpose servers, and hypervisors.
This part is divided into three categories: Domains, Bare-Metal Servers & Virtual Machines, and Interconnection.
Active Directory Domains
Currently, the network is primarily separated into two domains: ad1 and ad2.
 
ad1
ad1 includes all servers that are dedicated to genetic research projects. It also safeguards the core services of the entire network, such as the CA policy server and the hypervisors. This is the real production environment and has its own set of proxies and firewalls. Depending on project requirements, servers and virtual machines can be added to or removed from ad1 dynamically.
Currently, ad1 has the following servers:
| Model | CPU | Amount | Purpose | Type | 
|---|---|---|---|---|
| DELL R630 | 2 * Intel Xeon E5-2640 V4 20C40T @2.4G | 1 | File preprocessing & downloading | Bare-Metal | 
| DELL R630 | 2 * Intel Xeon E5-2640 V4 20C40T @2.4G | 1 | Result compressing & tape writing | Bare-Metal | 
| DELL R630 | 2 * Intel Xeon E5-2640 V4 20C40T @2.4G | 1 | Genetic analysis | Bare-Metal | 
| DELL R740 | 2 * Intel Xeon Platinum 8269CY 52C104T @2.5G | 2 | Genetic analysis | Bare-Metal | 
| DELL R740 | 2 * Intel Xeon Platinum 8269CY 52C104T @2.5G | 1 | Hypervisor | Bare-Metal | 
| Supermicro X10QBL | 4 * Intel Xeon E7-8892V2 60C120T @2.8G | 1 | Hypervisor | Bare-Metal | 
| HP DL380 G9 | 2 * Intel Xeon E5-2650 V4 24C48T @2.2G | 3 | Storage node | Bare-Metal | 
| VM | 4 Core | 2 | Domain Controller | Virtual Machine | 
| VM | 4 Core | 1 | Cloud Drive | Virtual Machine | 
| VM | ... | ... | More VMs omitted for security reasons. | Virtual Machine | 
For details on how these servers work in the project, see the Genetic Analysis Project.
ad2
ad2 includes all servers that are for general, personal, or experimental use. It contains all "less-critical" services and is usually used as an "experimental field" for ad1. Before a new group policy is deployed to ad1, it is deployed to ad2 first. Once a new technology has proved to be stable there, it can then be deployed to ad1.
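The ad2-first rollout described above can be sketched as a small gating function. This is only an illustration: the domain names come from the text, while the `Change` class, its fields, and `next_target` are hypothetical names, and how "stable" is judged is left as a simple flag.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Domain names come from the text; everything else is a hypothetical sketch.
STAGING_DOMAIN = "ad2"
PRODUCTION_DOMAIN = "ad1"


@dataclass
class Change:
    """A group policy or new technology waiting to be rolled out."""
    name: str
    deployed_to: Set[str] = field(default_factory=set)
    stable_in_staging: bool = False  # assumed to be set manually after observation


def next_target(change: Change) -> Optional[str]:
    """Return the domain this change may be deployed to next, or None."""
    if STAGING_DOMAIN not in change.deployed_to:
        return STAGING_DOMAIN      # always trial in ad2 first
    if PRODUCTION_DOMAIN in change.deployed_to:
        return None                # already deployed to both domains
    if change.stable_in_staging:
        return PRODUCTION_DOMAIN   # proved stable in ad2, may go to ad1
    return None                    # still under observation in ad2
```

A change that has not yet been marked stable in ad2 never gets ad1 as a target, which is the whole point of keeping a separate experimental domain.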
Currently, ad2 has the following servers:
| Model | CPU | Amount | Purpose | Type | 
|---|---|---|---|---|
| HP DL380 G9 | 2 * Intel Xeon E5-2650 V4 24C48T @2.2G | 1 | Storage node | Bare-Metal | 
| VM | 4 Core | 2 | Domain Controller | Virtual Machine | 
| VM | 4 Core | 1 | ASL Cloud Drive | Virtual Machine | 
| VM | 2 Core | 1 | Web Primary Proxy | Virtual Machine | 
| VM | 2 Core | 1 | Web forum | Virtual Machine | 
| VM | 2 Core | 1 | Web static resources | Virtual Machine | 
| VM | 2 Core | 1 | Web analytics | Virtual Machine | 
| VM | 2 Core | 1 | Web mail | Virtual Machine | 
| VM | ... | ... | More VMs omitted for security reasons. | Virtual Machine | 
Bare-Metal Servers and Virtual Machines
Introduction
A mix of bare-metal servers and virtual machines is used to fulfill our needs. For scientific computation that requires the highest possible performance, bare-metal servers are assigned to maximize efficiency. For needs like internal proxies or temporary experimental environments, a bare-metal server is clearly overkill. For such purposes, we create a virtual machine for each application or requirement.
Bare-Metal Servers
Bare-metal servers are physical servers that have a generic OS installed and (mostly) serve a single purpose. This is mostly seen in the genetic research project, where a large amount of data needs to be downloaded, analyzed, and compressed.
Whenever the purpose of a bare-metal server changes, we reinstall the OS remotely via the management interface and reconfigure the server for its new task.
 
Virtual Machines
Virtual machines are much more flexible. Although all the hypervisors are governed under ad1, the virtual machines they host are not.
Currently, we create a virtual machine whenever a new application needs to be deployed. This includes a new website, a new experimental environment, or simply a new service such as a database or email.
When there are enough resources, a virtual machine is more flexible than a Docker container. Whenever an application is taken out of service, the virtual machine hosting it is deleted from the system.
 
Virtual Machine Overhead
Unlike a Docker container, each VM has its own complete OS. This creates overhead when the same OS is installed on multiple virtual machines. Fortunately, a common Linux distribution like Debian or Ubuntu can be installed on a 10GB disk without problems. In most cases, 10GB is enough for the base system plus newly installed software and dependencies. Application data that is much larger is kept on a mounted remote file system; that is, all the VMs share the space of the storage nodes. This greatly reduces wasted disk space. Since each hypervisor is equipped with TBs of SSD storage, each one can host up to hundreds of virtual machines.
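As a rough check of the "hundreds of virtual machines" figure, the disk side of the estimate can be written out. The 10GB base image comes from the text. The 2000GB SSD capacity and the 20% reserve are assumptions chosen only for illustration, since the text says "TBs" without an exact number, and the function name is hypothetical.

```python
def max_vms_by_disk(ssd_capacity_gb: int,
                    base_image_gb: int = 10,
                    reserve_percent: int = 20) -> int:
    """Upper bound on VM count from local SSD space alone.

    base_image_gb   -- per-VM system disk (10GB, as stated in the text)
    reserve_percent -- share of the SSD kept free (assumed, not from the text)
    """
    # Integer arithmetic keeps the result exact.
    usable_gb = ssd_capacity_gb * (100 - reserve_percent) // 100
    return usable_gb // base_image_gb


# Assumed example: a hypervisor with 2000GB of SSD storage.
vm_limit = max_vms_by_disk(2000)
```

This bounds only disk usage. Memory and CPU on the hypervisor will usually cap the real number of VMs earlier, so the result should be read as a ceiling, not a target.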
 
Interconnection
Network
All the servers are connected over full-fiber 40G Ethernet. On the storage nodes, RDMA (RoCE) is configured so that CPU performance does not limit network throughput. All the physical servers are plugged directly into the core 40G switch with MPO fiber.
Management interfaces and all other downstream equipment are plugged into an H3C switch that has a total of 40G uplink (4 x 10G SFP ports bonded together). This ensures that the network can deliver its full performance to any equipment at any time.
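The uplink sizing above is simple arithmetic, sketched here for clarity. The 4 x 10G bond comes from the text. The downstream port count and port speed are hypothetical, because the text does not say how many devices hang off the H3C switch, and both function names are made up for this example.

```python
from fractions import Fraction


def bonded_uplink_gbps(member_links: int, link_speed_gbps: int) -> int:
    """Aggregate capacity of a bonded uplink."""
    return member_links * link_speed_gbps


def oversubscription_ratio(downstream_ports: int,
                           port_speed_gbps: int,
                           uplink_gbps: int) -> Fraction:
    """Total downstream capacity divided by uplink capacity."""
    return Fraction(downstream_ports * port_speed_gbps, uplink_gbps)


uplink = bonded_uplink_gbps(4, 10)               # 4 x 10G, as in the text
ratio = oversubscription_ratio(24, 1, uplink)    # 24 x 1G ports is an assumption
```

A ratio at or below 1 means the downstream ports together cannot exceed the aggregate uplink capacity, which is the condition the paragraph above relies on.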
 
VLAN Isolation
Separate VLANs are created and assigned for each purpose and each machine. This isolation protects most servers. However, the VLAN configuration is not complete yet, and more information will be available once the project is complete.
Gateway
There are a total of 3 firewall zones, each corresponding to a different route to the internet, that is, to one of the 3 primary access lines of the datacenter. Each route has its own access policy, and the routes serve as backups for each other. Whenever a service is set up, it is assigned an access policy that lets it use all or any of the routes.
 
Typically, applications with intensive traffic (like the web drive) use a different route than applications that are less intensive but carry more control data (like a service monitor or console). This keeps the most critical services stable by avoiding packet loss at the gateway.
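The route assignment described above can be sketched as an ordered preference list per service with fallback to a backup route. This is a minimal illustration, not the actual firewall configuration: the route names, service names, and `pick_route` are all hypothetical, and only the idea of 3 routes backing each other up comes from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Route:
    name: str
    healthy: bool = True


# Hypothetical names for the 3 access lines.
ROUTES: Dict[str, Route] = {
    "line-a": Route("line-a"),
    "line-b": Route("line-b"),
    "line-c": Route("line-c"),
}

# Hypothetical policies: heavy traffic and control traffic prefer
# different lines, with a shared backup.
POLICIES: Dict[str, List[str]] = {
    "web-drive": ["line-a", "line-b"],
    "service-monitor": ["line-c", "line-b"],
}


def pick_route(service: str,
               routes: Dict[str, Route],
               policies: Dict[str, List[str]]) -> Optional[str]:
    """Return the first healthy route in the service's policy, or None."""
    for name in policies.get(service, []):
        route = routes.get(name)
        if route is not None and route.healthy:
            return name
    return None
```

With all lines healthy, the web drive and the service monitor end up on different routes. If a preferred line is marked unhealthy, the service falls back to the next route allowed by its policy.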
The Future Work
Currently, there are two focuses for the datacenter.
- Improve the network security: Hybrid Storage Project
- Improve the efficiency of storage for both SSDs and HDDs: GPFS Storage Project
These plans, however, require further study of their potential impact on structure and performance, and thus have been assigned to separate projects.