We recently made the following changes in our LAN:
1) Purchased an L2 connection between our two sites from the ISP
2) Stretched the iSCSI VLAN between the two sites for replication
Previously our ISP provided a VPN service between the two sites, but since we are improving our DR setup we wanted the ability to stretch VLANs between the two sites.
After the changes were made we started seeing performance issues in the network. If we ping an IP locally on Site A we usually get <1 ms response times, but sometimes the response times go up to 2-3 ms. We did not see this behaviour before.
When the Dell EqualLogic replication kicks in we see local response times of around 15-80 ms, sometimes even over 100 ms. This is causing some issues in the network.
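To put numbers on this, one option is to log ping output once during a replication window and once outside it, then compare the two. Below is a minimal Python sketch, assuming the usual `time=... ms` field that common ping tools print. The function name `summarize_ping` and the 10 ms "slow" threshold are made up for illustration.

```python
import re
import statistics

# Matches the round-trip field in ping output, e.g. "time=0.42 ms" or "time<1ms".
_RTT = re.compile(r"time[=<]\s*([0-9.]+)\s*ms")

def summarize_ping(output, slow_ms=10.0):
    """Summarize round-trip times found in saved ping output (hypothetical helper)."""
    rtts = [float(m.group(1)) for m in _RTT.finditer(output)]
    if not rtts:
        return None  # no replies found in the text
    return {
        "count": len(rtts),
        "min_ms": min(rtts),
        "median_ms": statistics.median(rtts),
        "max_ms": max(rtts),
        # Replies slower than the (arbitrary) threshold.
        "slow": sum(1 for r in rtts if r > slow_ms),
    }
```

Running it on a capture taken while replication is idle and on one taken while it is active gives two summaries that can be compared side by side.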
I have looked for duplex errors but cannot find any duplex mismatch. All switchports are set to auto-negotiate. Spanning tree seems to be OK.
The iSCSI VLAN is isolated; there is no routing between the iSCSI VLAN and other VLANs/networks. This is the only stretched VLAN at the moment. All other VLANs are routed by PowerConnect 6248 switches on each site.
I find it strange that the replication seems to make the situation worse, because we were replicating just as much before the changes. So the bandwidth utilization should not be higher now than it was before.
I am thinking that maybe something else has changed and is causing the issues, or that something is going terribly wrong with the stretched VLAN?
Does this kind of problem sound familiar to anyone? Anyone who can provide any good troubleshooting advice? I would be happy to supply details about the switch topology and configuration if someone is interested.
Sorry, not too many suggestions that I can think of off the top of my head. Our iSCSI network is currently physically separate. Hopefully you're running jumbo frames on the iSCSI devices and switches to decrease the overhead of all the iSCSI traffic. I'm happy to comment on the topology if you want to post it.
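As a rough illustration of the jumbo frame point, here is a small Python sketch of the per-frame arithmetic. It assumes plain IPv4 and TCP headers with no options, and it ignores the iSCSI PDU headers. The function name is made up for illustration.

```python
# Per-frame Ethernet cost on the wire, in bytes:
# 14 header + 4 FCS + 8 preamble + 12 inter-frame gap.
ETH_OVERHEAD = 14 + 4 + 8 + 12
VLAN_TAG = 4            # 802.1Q tag, present on tagged links
IP_TCP_HEADERS = 20 + 20  # IPv4 + TCP, assuming no options

def wire_efficiency(mtu, tagged=True):
    """Fraction of on-wire bytes that is TCP payload for a full-size frame."""
    payload = mtu - IP_TCP_HEADERS
    on_wire = mtu + ETH_OVERHEAD + (VLAN_TAG if tagged else 0)
    return payload / on_wire

for mtu in (1500, 9000):
    print(mtu, round(wire_efficiency(mtu), 4))
```

With these assumptions the payload share goes from roughly 95% at an MTU of 1500 to roughly 99% at 9000, and the frame count for the same amount of data drops by about a factor of six. Jumbo frames only help if every device on the L2 path supports them, which would include the ISP link in a stretched setup.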
The increase in response times locally *could* be attributed to the stretched iSCSI VLAN, in the sense that if there is a significant number of broadcasts on that VLAN at either site, those will now (I'm assuming) reach all devices at both sites, whereas before the traffic would have been routed and the broadcasts dropped...
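One way to check the broadcast theory is to take a packet capture on the iSCSI VLAN and look at how many frames are addressed to broadcast or multicast destinations. Below is a minimal Python sketch, assuming the destination MAC addresses have already been exported from a capture as strings. The helper names are hypothetical, and whether multicast is actually flooded depends on the switch configuration.

```python
from collections import Counter

def classify_mac(dst):
    """Classify a destination MAC such as 'ff:ff:ff:ff:ff:ff'."""
    octets = bytes.fromhex(dst.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC address: {dst!r}")
    if octets == b"\xff" * 6:
        return "broadcast"
    # The lowest bit of the first octet is the group (multicast) bit.
    if octets[0] & 0x01:
        return "multicast"
    return "unicast"

def flood_share(dst_macs):
    """Return per-class counts and the fraction of broadcast plus multicast frames."""
    counts = Counter(classify_mac(m) for m in dst_macs)
    total = sum(counts.values())
    flooded = counts["broadcast"] + counts["multicast"]
    return counts, (flooded / total if total else 0.0)
```

A high share on a VLAN that should carry almost nothing but unicast iSCSI traffic would support the idea that flooded traffic is now crossing the inter-site link.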
Please provide more details about this "stretched" VLAN. How are you passing the tagging? Is this a Metro-E circuit, and if so which protocols does your ISP support passing? I'm an EqualLogic user, and replicate to my CoLo facility, but chose a simple routed approach, as my ISP required me to have Cisco gear in order to support VTP. I didn't have any equipment like that, so I just route everything to my CoLo, and things work fine.