Posted by Fred Zhang
At our SNIA Ethernet Storage Forum (ESF) webcast “Re-Introduction to Ethernet Networked Storage,” we provided a solid foundation on Ethernet networked storage, the move to higher speeds, challenges, use cases and benefits. Here are answers to the questions we received during the live event.
Q. Within the iWARP protocol there is a layer called MPA (Marker PDU Aligned Framing for TCP) inserted for storage applications. What is the point of this protocol?
A. MPA is an adaptation layer between the iWARP Direct Data Placement Protocol and TCP/IP. It provides framing and CRC protection for Protocol Data Units. MPA enables packing of multiple small RDMA messages into a single Ethernet frame. It also enables an iWARP NIC to place frames received out-of-order (instead of dropping them), which can be beneficial on best-effort networks. More detail can be found in IETF RFC 5044 and IETF RFC 5041.
Q. What is the API for RDMA network IPC?
The general API for RDMA is called verbs. The OpenFabrics Verbs Working Group oversees the development of verbs definition and functionality in the OpenFabrics Software (OFS) code. You can find the training content from OpenFabrics Alliance here. General information about RDMA for Ethernet (RoCE) is available at the InfiniBand Trade Association website. Information about Internet Wide Area RDMA Protocol (iWARP) can be found at IETF: RFC 5040, RFC 5041, RFC 5042, RFC 5043, RFC 5044.
Q. RDMA requires TCP/IP (iWARP), InfiniBand, or RoCE to operate on with respect to NVMe over Fabrics. Therefore, what are the advantages of disadvantages of iWARP vs. RoCE?
A. Both RoCE and iWARP support RDMA over Ethernet. iWARP uses TCP/IP while RoCE uses UDP/IP. Debating which one is better is beyond the scope of this webcast, but you can learn more by watching the SNIA ESF webcast, “How Ethernet RDMA Protocols iWARP and RoCE Support NVMe over Fabrics.”
Q. 100Gb Ethernet Optical Data Center solution?
A. 100Gb Ethernet optical interconnect products were first available around 2011 or 2012 in a 10x10Gb/s design (100GBASE-CR10 for copper, 100GBASE-SR10 for optical) which required thick cables and a CXP and a CFP MSA housing. These were generally used only for switch-to-switch links. Starting in late 2015, the more compact 4x25Gb/s design (using the QSFP28 form factor) became available in copper (DAC), optical cabling (AOC), and transceivers (100GBASE-SR4, 100GBASE-LR4, 100GBASE-PSM4, etc.). The optical transceivers allow 100GbE connectivity up to 100m, or 2km and 10km distances, depending on the type of transceiver and fiber used.
Q. Where is FCoE being used today?
A. FCoE is primarily used in blade server deployments where there could be contention for PCI slots and only one built-in NIC. These NICs typically support FCoE at 10Gb/s speeds, passing both FC and Ethernet traffic via connect to a Top-of-Rack FCoE switch which parses traffic to the respective fabrics (FC and Ethernet). However, it has not gained much acceptance outside of the blade server use case.
Q. Why did iSCSI start out mostly in lower-cost SAN markets?
A. When it first debuted, iSCSI packets were processed by software initiators which consumed CPU cycles and showed higher latency than Fibre Channel. Achieving high performance with iSCSI required expensive NICs with iSCSI hardware acceleration, and iSCSI networks were typically limited to 100Mb/s or 1Gb/s while Fibre Channel was running at 4Gb/s. Fibre Channel is also a lossless protocol, while TCP/IP is lossey, which caused concerns for storage administrators. Now however, iSCSI can run on 25, 40, 50 or 100Gb/s Ethernet with various types of TCP/IP acceleration or RDMA offloads available on the NICs.
Q. What are some of the differences between iSCSI and FCoE?
A. iSCSI runs SCSI protocol commands over TCP/IP (except iSER which is iSCSI over RDMA) while FCoE runs Fibre Channel protocol over Ethernet. iSCSI can run over layer 2 and 3 networks while FCoE is Layer 2 only. FCoE requires a lossless network, typically implemented using DCB (Data Center Bridging) Ethernet and specialized switches.
Q. You pointed out that at least twice that people incorrectly predicted the end of Fibre Channel, but it didn’t happen. What makes you say Fibre Channel is actually going to decline this time?
A. Several things are different this time. First, Ethernet is now much faster than Fibre Channel instead of the other way around. Second, Ethernet networks now support lossless and RDMA options that were not previously available. Third, several new solutions–like big data, hyper-converged infrastructure, object storage, most scale-out storage, and most clustered file systems–do not support Fibre Channel. Fourth, none of the hyper-scale cloud implementations use Fibre Channel and most private and public cloud architects do not want a separate Fibre Channel network–they want one converged network, which is usually Ethernet.
Q. Which storage protocols support RDMA over Ethernet?
A. The Ethernet RDMA options for storage protocols are iSER (iSCSI Extensions for RDMA), SMB Direct, NVMe over Fabrics, and NFS over RDMA. There are also storage solutions that use proprietary protocols supporting RDMA over Ethernet.