Facebook WDT Test

March 17, 2017 Architect, backup, hardware, IDC, network, system No comments

We test Facebook WDT tool:https://github.com/facebook/wdt

Read this PDD: WDT-TEST


June 29, 2016 Architect, Architecture, hardware, MYSQL, network No comments

we use tcpcopy to make real traffic on our core systems. Many problems will be found in advance if we enlarge queries several times.


Infiniflash Benchmark

May 29, 2016 Architect, Architecture, hardware, MYSQL, performance No comments

Sandisk (FusionIO) and Nexenta are working together to build this SDS solution.

Infiniflash is a very large SDS production, which manages for very large DW system who requires large storage space and also high IOPS.

We test infiniflash system ,read this Infiniflash_benchmark

MySQL distributed message system

April 12, 2016 Architect, Architecture, hardware, MYSQL, replication, software No comments

Based on messages, we create mysql replication platforms , using async message to build strong distributed subscription system.

read this PDF : http://www.vmcd.org/docs/MySQL_async_message.pdf

TokuDB benchmark on PCIe

October 21, 2015 Architect, Architecture, hardware, Internals, MYSQL, performance No comments

MariaDB TokuDB benchmark on FusionIO ,Compare TokuDB and InnoDB engines.

read: TokuDB_benchmark

NVMFS Supports Atomic Writes

September 6, 2015 Architect, hardware, Internals, MYSQL No comments

Benchmark for NVMFS (supports atomic writes ,so we can close double-write option on specific MySQL version )

tips: some Flash-based cards could support large block map , the main idea is to avoid fractured page writes.


关于Dell 推出第13代服务器的一些想法

September 12, 2014 Architect, hardware No comments

戴尔近日推出了旗下的13G服务器,其主力机型为R730xd,包含了诸多的特性,为其成为主流db server以及规模存储集群打下了良好的基础。




1.CPU 为intel haswall最新架构,减少了功能的损耗
2.更多的插槽,扩展为可支持18块1.8寸SSD的槽位 以及多种磁盘混插的模式
3.DDR4 memory 拥有更高的主频
6.基于iDRAC8的自动管理功能 包括服务器性能的监控,邮件报警(app端)等等
7.Sandisk的缓存技术取代之前的LSI的(是否与LSI被希捷收购有关 ?)
8.增强的新一代的RAID卡 更大的内存以及基于RAID卡的直接系统日志收集等(依然采用电池)
10.NVMe协议的支持 (支持 NVMe_SSD 全面拥抱Intel ?)


根据戴尔sales的描述,R730xd为下一代db-server,hadoop server 以及云计算server.在这里针对hadoop server持保留意见,其18块ssd的插槽扩展虽然增加了ssd的整体容量,但对于hadoop这类应用,或者对于目前hadoop的软件架构,SSD是否能发挥其应有的性能,facebook的测试给出了答案。


Also, a SSD device can support 100K to 200K operations/sec while a spinning disk controller can possibly 
issue only 200 to 300 ops/sec. This means that random reads/writes are not a bottleneck on SSDs. 
On the other hand, most of our existing database technology is designed to store data in spinning disks, 
so the natural question is "can these databases harness the full potential of the SSDs"?



结论为现在HADOOP/hbase 并不能将SSD的性能优势发挥的玲离尽致 hadoop修改代码后的瓶颈依然存在(JAVA DFSClient),hbase线程锁导致cpu利用率低下,这归根于传统的数据库基于机械硬盘IO的设计,不过这一点在oracle上解决的非常好(oracle 在unix/linux是基于进程的数据库)。

最后如Dhruba Borth所说

@Sujoy: you are absolutely right. In fact, we currently run multiple servers instances per SSD 
just to be able to utilize all the IOPs. This is kindof-a-poor man's solution to the problem. 
Also, you have to have enough CPU power on the server to be able to drive multiple database 
instances on the same machine.

Facebook通过多实例并用server来以最小的成本达到硬件的最大性能,这类似于早期的mysql,mysql的多线程架构并不能在SMP NUMA架构的机器中充分利用CPU的能力,所以衍生出了NUMA多实例,多种绑定CPU的策略。所以在传统的数据库架构下要契合最新的硬件并不是一件很轻松的事。

另外针对线程以及进程(在unix时代对线程支持不是非常好,所以如oracle pg等数据库采用了进程的方式,mysql采用线程在早期对CPU的利用也是十分低下的) 可以暂且认为线程是近代DB的一种趋势(不知道准不准确)因为线程本省对于进程来说是具有一定优势的(内存的共享 以及更小的创建代价,更低的CPU上下文切换代价)

[SHOUG] 1号店Exadata应用

April 3, 2014 Architect, Exadata, hardware, oracle No comments

Download PDF: [SHOUG.LOUISLIU]Exadata在电商的实践

FlashCard Test Detail

March 5, 2014 Architect, hardware No comments

Compare between FIO LSI and VD
Upload on 2014/3/5 by louis liu

Download this PDF

IBM will support Flash In DIMM using MCS

January 17, 2014 Architect, hardware, system, unix No comments

The next generation of IBM’s X-series servers will be able to accommodate solid-state Flash drives clipped into their DIMM memory slots, potentially improving the response times of fast-paced enterprise applications.

On Thursday, IBM unveiled the Series 6 generation of its System X x86-based servers. In addition to the novel reuse of DIMM slots, the X6 architecture will also let customers upgrade them to a new generation of processors or memory without swapping in a new motherboard.


Diablo Technologies, a memory technology company, developed Memory Channel Storage (MCS) that 
enables flash on a DIMM module to be accessed by the CPU, instead of using the SATA bus as other 
DIMM form-factor SSD products have. Using a host-level driver and an ASIC on the DIMM, it creates 
a special memory storage layer in flash through which the CPU actually moves data from the RAM 
memory space. It also requires a minor modification in the server BIOS to be supported by the CPU, something 
that three OEMs have currently completed. 

Each DIMM flash module has 16 separate data channels, which are independently addressable. 
This enables parallel data writes by the driver, improving performance over the DMA process 
used by PCIe-based solutions. Specs for ULLtraDIMM are showing 5 microsecond latencies for these 
devices, an order of magnitude better than typical PCIe flash products. This architecture also 
enables up to 63 ULLtraDIMM modules to be aggregated creating 25TB of flash capacity and >9M IOPS in a single server.


Ref: IBM X series servers now pack Flash into speedy DIMM slots
IBM Beefs Up Enterprise X-Architecture With Flash, Modular Design
Heating Up Storage Performance
How to Make Flash Accessible on the Memory Bus
Memory Channel Storage™
ULLtraDIMM: combining SSD and DRAM for the enterprise