
FreeBSD 10: VIMAGE (virtualized network stack) with if_bridge & epair

VIMAGE virtualizes the network stack for a jail: the interface is detached from the HOST and moved into the JAIL environment.

Using a bridge (if_bridge) and a pair of virtual back-to-back interfaces (epair), transparent routing can be implemented.

The simplest form of network virtualization is VIMAGE with if_bridge/epair. A far more complex solution is VIMAGE combined with netgraph (similar to Solaris Crossbow), but more on that in another blog article.

Example:

[Figure: freebert_vimage]

WARNING: VIMAGE (virtualized network stack) is a highly experimental feature.

Limitation 1: epairs currently cannot be operated cleanly together with lagg (Link Aggregation and Failover) interfaces:

$
bridge0: Ethernet address: ff:ff:ff:ff:ff:ff
epair1a: Ethernet address: ff:ff:ff:ff:ff:ff
epair1b: Ethernet address: ff:ff:ff:ff:ff:ff
epair1a: link state changed to UP
epair1b: link state changed to UP
bridge0: error setting interface capabilities on lagg0
epair1a: promiscuous mode enabled
ng_ether_ifnet_arrival_event: can't re-name node epair1b
$

Limitation 2: combining VIMAGE with PF crashes the kernel! (Fall back on IPFIREWALL or IPFilter (not tested) instead.)

$
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff81c4ff79
stack pointer           = 0x28:0xfffffe01222e4600
frame pointer           = 0x28:0xfffffe01222e4630
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1901 (jail)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff80911180 at kdb_backtrace+0x60
#1 0xffffffff808d8745 at panic+0x155
#2 0xffffffff80d12ae2 at trap_fatal+0x3a2
#3 0xffffffff80d12db9 at trap_pfault+0x2c9
#4 0xffffffff80d12546 at trap+0x5e6
#5 0xffffffff80cf97e2 at calltrap+0x8
#6 0xffffffff81c4fbf8 at pf_altq_ifnet_event+0x48
#7 0xffffffff81c4bce1 at pfi_attach_ifnet_event+0xd1
#8 0xffffffff8099a843 at if_attach_internal+0x463
#9 0xffffffff809b0eca at lo_clone_create+0x9a
#10 0xffffffff809a6445 at if_clone_createif+0xb5
#11 0xffffffff809a6f2e at if_clone_simple+0xbe
#12 0xffffffff809b0e03 at vnet_loif_init+0x23
#13 0xffffffff809c1ea7 at vnet_sysinit+0x77
#14 0xffffffff809c1cbf at vnet_alloc+0xdf
#15 0xffffffff808a9c80 at kern_jail_set+0x1af0
#16 0xffffffff808abb81 at sys_jail_set+0x41
#17 0xffffffff80d133d7 at amd64_syscall+0x357
Uptime: 6m9s
Dumping 315 out of 4060 MB:..6%..11%..21%..31%..41%..51%..61%..72%..82%..92%
$

Limitation 3: when running several FreeBSD VIMAGE jail servers, watch out for duplicate epair MAC addresses:

$
epair1a: Ethernet address: ff:ff:ff:ff:ff:ff
epair1b: Ethernet address: ff:ff:ff:ff:ff:ff
epair1a: link state changed to UP
epair1b: link state changed to UP
epair1a: promiscuous mode enabled
ng_ether_ifnet_arrival_event: can't re-name node epair1b
epair1b: DAD detected duplicate IPv6 address fe80:ffff:ffff:ffff:ffff:ffff:ffff: NS in/out=0/1, NA in=1
epair1b: DAD complete for fe80:ffff:ffff:ffff:ffff:ffff:ffff - duplicate found
epair1b: manual intervention required
epair1b: possible hardware address duplication detected, disable IPv6
$
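One way to spot such collisions is to compare the `ether` lines of the `ifconfig` output from all hosts and jails involved. The following is only a minimal sketch: the function name `find_duplicate_macs` is made up for illustration, and it assumes nothing more than the stanza layout of the `ifconfig` output shown in this article.

```shell
#!/bin/sh
# Sketch (hypothetical helper): read `ifconfig` output on stdin and print
# every MAC address that appears on more than one interface.
find_duplicate_macs() {
    awk '
        # An interface stanza starts in column 1 with "name: flags=...".
        /^[^ \t]+: / { sub(/:$/, "", $1); ifname = $1 }
        # "ether xx:xx:xx:xx:xx:xx" carries the MAC of the current interface.
        $1 == "ether" {
            count[$2]++
            names[$2] = (names[$2] == "" ? ifname : names[$2] " " ifname)
        }
        END {
            for (mac in count)
                if (count[mac] > 1)
                    print mac ": " names[mac]
        }
    '
}
```

Feed it the concatenated output of several machines or jails, for example `{ ifconfig; jexec test01 ifconfig; } | find_duplicate_macs`. Each reported line lists a MAC address followed by the interfaces that share it.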

Patch: Temporary epair MAC address fix

The patch is incomplete. So that epairNa also gets a better random MAC address, it should look as follows (lines 64, 725 and 821):

$
64 #include <sys/libkern.h>

725 eaddr[1] = arc4random() & 0xff;

821 eaddr[1] = arc4random() & 0xff;
$

see: GitHub – plitc / freebsd / Temporary epair MAC address fix

– Without this patch, VIMAGE crashes very quickly when several FreeBSD servers share the same network –
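Where patching the kernel is not an option, a userland workaround might be to give each epair end a random, locally administered MAC address before the jail starts. This is only a sketch and not the patch referenced above: `random_mac` is a hypothetical helper, and the fixed first octet `02` is an assumption chosen so that the address is marked as locally administered and unicast.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print a random, locally administered
# unicast MAC address of the form 02:xx:xx:xx:xx:xx.
random_mac() {
    # Read 5 random bytes as hex and prefix each byte with a colon.
    tail=$(od -An -N5 -tx1 /dev/urandom | tr -d ' \t\n' | sed 's/../:&/g')
    # First octet 02: locally-administered bit set, multicast bit clear.
    printf '02%s\n' "$tail"
}
```

It could be called from a prestart hook, e.g. `ifconfig epair1a ether "$(random_mac)"`. Whether this alone is enough to avoid the crash described above has not been verified here.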

Required:
– VIMAGE support in the kernel
– if_bridge/epair as well as netgraph/netgraph_ether
– PF disabled

Installation: VIMAGE

Step 1: first, add VIMAGE support using the current kernel sources

$
cd /usr/ports/devel/subversion/ && make install clean

zfs create -o checksum=sha256 -o compression=lz4 -o mountpoint=/usr/src zroot/usr/src
zfs create -o checksum=sha256 -o compression=lz4 -o mountpoint=/usr/obj zroot/usr/obj

cd /usr
chflags -R noschg /usr/obj/*
rm -rfv /usr/obj/*
rm -rfv /usr/src/*
rm -rfv /usr/src/.svn

cd /usr/src
svn checkout https://svn0.eu.FreeBSD.org/base/releng/10.0 /usr/src
svn up /usr/src
$

Step 2: kernel config

$
cd /usr/src/sys/amd64/conf
mkdir /root/kernels
cp GENERIC /root/kernels/VIMAGE
ln -s /root/kernels/VIMAGE
vi /root/kernels/VIMAGE
$
$
### ### ### VIMAGE ### ### ###
#
cpu             HAMMER
ident           VIMAGE

makeoptions     DEBUG=-g                        # Build kernel with gdb(1) debug symbols
makeoptions     WITH_CTF=1                      # Run ctfconvert(1) for DTrace support

### < --- --- --- > ###

options         IPFIREWALL                      # enables IPFW
options         IPFIREWALL_VERBOSE              # enables logging for rules with log keyword
options         IPFIREWALL_VERBOSE_LIMIT=5      # limits number of logged packets per-entry
options         IPFIREWALL_DEFAULT_TO_ACCEPT    # sets default policy to pass what is not explicitly denied
options         IPDIVERT                        # enables NAT

###options         DUMMYNET                        # traffic shaper, bandwidth manager and delay emulator
###options         HZ=1000                         # strongly recommended

device          carp
device          lagg
device          enc
device          gre
options         XBONEHACK

options         TCP_SIGNATURE                   # include support for RFC 2385

options         VIMAGE                          # Network Stack Virtualization
options         NULLFS                          # NULL filesystem

### VIMAGE - if_bridge/epair virtualization // ###
device          if_bridge
device          epair
### // VIMAGE - if_bridge/epair virtualization ###

### VIMAGE - netgraph virtualization // ###
options         NETGRAPH
options         NETGRAPH_ETHER
options         NETGRAPH_BRIDGE
options         NETGRAPH_EIFACE
options         NETGRAPH_SOCKET
### // VIMAGE - netgraph virtualization ###

device          tap                             # virtual link layer 2 device

options         VFS_AIO

### DEFAULT ### options         TCP_OFFLOAD     # TCP offload

options         RACCT                           # Resource accounting
options         RCTL                            # Controls resource limits

device          crypto                          # core crypto support
device          cryptodev                       # /dev/crypto for access to h/w

device          rndtest                         # FIPS 140-2 entropy tester

device          hifn                            # Hifn 7951, 7781, etc.
options         HIFN_DEBUG                      # enable debugging support: hw.hifn.debug
options         HIFN_RNDTEST                    # enable rndtest support

device          ubsec                           # Broadcom 5501, 5601, 58xx
options         UBSEC_DEBUG                     # enable debugging support: hw.ubsec.debug
options         UBSEC_RNDTEST                   # enable rndtest support

options         IPSEC                           # IP security (requires device crypto)
options         IPSEC_NAT_T                     # NAT-T support, UDP encap of ESP

options         FDESCFS                         # File descriptor filesystem

### NOT WITH VIMAGE ### device          pf
### NOT WITH VIMAGE ### device          pflog
### NOT WITH VIMAGE ### device          pfsync
### NOT WITH VIMAGE ### options         ALTQ
### NOT WITH VIMAGE ### options         KTR_ALQ
### NOT WITH VIMAGE ### options         ALTQ_CBQ       # Class Based Queueing
### NOT WITH VIMAGE ### options         ALTQ_RED       # Random Early Detection
### NOT WITH VIMAGE ### options         ALTQ_RIO       # RED In/Out
### NOT WITH VIMAGE ### options         ALTQ_HFSC      # Hierarchical Packet Scheduler
### NOT WITH VIMAGE ### options         ALTQ_CDNR      # Traffic conditioner
### NOT WITH VIMAGE ### options         ALTQ_PRIQ      # Priority Queueing
### NOT WITH VIMAGE ### options         ALTQ_NOPCC     # Required if the TSC is unusable
### NOT WITH VIMAGE ### options         ROUTETABLES=15 # max 16 FIB (Forward Information Base/multiple routing tables) support
#
### NOT WITH VIMAGE ### options         SCTP           # Stream Control Transmission Protocol
#
### ### ### VIMAGE ### ### ###
$
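Before starting a long kernel build, it can be worth checking that the config really contains the required entries and that PF is not enabled by accident. The following is a rough sketch under stated assumptions: `has_entry` and `check_vimage_kernconf` are hypothetical helper names, the list of entries is taken from the requirements in this article, and lines beginning with `#` are treated as disabled. Entries supplied by kernel modules instead of the config are not considered.

```shell
#!/bin/sh
# Sketch (hypothetical helpers): verify a kernel config file against the
# requirements listed in this article.
has_entry() {
    # $1 = keyword (options|device), $2 = name, $3 = config file
    grep -Eq "^[[:space:]]*$1[[:space:]]+$2([[:space:]]|\$)" "$3"
}

check_vimage_kernconf() {
    conf=$1
    rc=0
    # Report every required entry that is missing or commented out.
    while read -r kind name; do
        if ! has_entry "$kind" "$name" "$conf"; then
            echo "missing: $kind $name"
            rc=1
        fi
    done <<EOF
options VIMAGE
device if_bridge
device epair
options NETGRAPH
options NETGRAPH_ETHER
EOF
    # PF must stay disabled in this setup.
    if has_entry device pf "$conf"; then
        echo "conflict: device pf is enabled"
        rc=1
    fi
    return "$rc"
}
```

For the file edited above, the call would be `check_vimage_kernconf /root/kernels/VIMAGE`. It prints nothing and returns 0 when all entries are present and PF is disabled.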

Step 3: build and install the kernel

$
cd /usr/src
time make buildkernel KERNCONF=VIMAGE
time make installkernel KERNCONF=VIMAGE

reboot
$

Step 4: adjust rc.conf
(in this example, bge0 is the physical interface)

$
vi /etc/rc.conf

### VIMAGE // ###
cloned_interfaces="bridge0"
ifconfig_bridge0_name="vswitch0"
ifconfig_vswitch0="addm bge0"
### // VIMAGE ###

### EZJAIL // ###
ezjail_enable="YES"
jail_parameters="vnet=new"
### // EZJAIL ###
$

Step 5: adjust sysctl.conf

$
vi /etc/sysctl.conf

### EZJAIL // ###
security.jail.allow_raw_sockets=1
security.jail.param.allow.raw_sockets=1
#
#net.add_addr_allfibs=4
### // EZJAIL ###
$

Step 6: install ezjail

$
cd /usr/ports/sysutils/ezjail/ && make install clean

vi /usr/local/etc/ezjail.conf

### ### ### EZJAIL ### ### ###
# ezjail_sourcetree=/usr/src
 
ezjail_use_zfs="YES"
ezjail_use_zfs_for_jails="YES"
ezjail_jailzfs="zroot/ezjail"
 
ezjail_zfs_properties="-o checksum=fletcher4 -o compression=lz4 -o atime=off"
### ### ### EZJAIL ### ### ###
# EOF

ezjail-admin install
ezjail-admin update -P
$

Step 7: create the jail

$
ezjail-admin create test01 0.0.0.0
$

Step 8: adjust the jail config
(the fixed IP address is commented out)

$
vi /usr/local/etc/ezjail/test01

export jail_test01_exec_stop="/bin/sh /etc/rc.shutdown"
export jail_test01_parameters="allow.raw_sockets=1 allow.sysvipc=1"
#export jail_test01_ip="0.0.0.0"
export jail_test01_exec_prestart0="ifconfig epair1 create up"
export jail_test01_exec_prestart1="ifconfig vswitch0 addm epair1a"
export jail_test01_exec_poststart0="ifconfig epair1b vnet test01"
export jail_test01_exec_poststart1="jexec test01 /sbin/ifconfig epair1b 192.168.0.101/24"
export jail_test01_exec_poststart2="jexec test01 /sbin/route add default 192.168.0.1"
export jail_test01_exec_poststop0="ifconfig epair1a destroy"
$
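With more than one jail, the same six lines have to be repeated with a different jail name, epair number and address each time. A small generator can reduce copy-and-paste mistakes. This is only a sketch: `vnet_jail_config` is a hypothetical helper, the bridge name `vswitch0` is taken from the rc.conf above, and the jail name is assumed to contain only letters, digits and underscores.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print the VNET-related ezjail settings
# for one jail.
# Arguments: jail name, epair number, address/prefix, default gateway.
vnet_jail_config() {
    jail=$1 n=$2 addr=$3 gw=$4
    cat <<EOF
export jail_${jail}_exec_prestart0="ifconfig epair${n} create up"
export jail_${jail}_exec_prestart1="ifconfig vswitch0 addm epair${n}a"
export jail_${jail}_exec_poststart0="ifconfig epair${n}b vnet ${jail}"
export jail_${jail}_exec_poststart1="jexec ${jail} /sbin/ifconfig epair${n}b ${addr}"
export jail_${jail}_exec_poststart2="jexec ${jail} /sbin/route add default ${gw}"
export jail_${jail}_exec_poststop0="ifconfig epair${n}a destroy"
EOF
}
```

Calling `vnet_jail_config test01 1 192.168.0.101/24 192.168.0.1` reproduces the hook lines shown above. The output can then be reviewed and pasted into the jail's config file.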

Step 9: from now on, ezjail-admin lists the jail without an IP address

$
# ezjail-admin list              
STA JID  IP              Hostname                       Root Directory
--- ---- --------------- ------------------------------ ------------------------
ZS  N/A  -               test01                         /usr/jails/test01
[root@test:~]#
$

Step 10: the IP address binding is now visible only inside the jail:

@HOST

$ ifconfig
vswitch0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether ff:ff:ff:ff:ff:ff
        nd6 options=1<PERFORMNUD>
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: epair1a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 7 priority 128 path cost 2000
        member: bge0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 2 priority 128 path cost 20000

epair1a: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether ff:ff:ff:ff:ff:ff
        inet6 fe80:ff:ff:ff:fe00:fff%epair1a prefixlen 64 scopeid 0x7 
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
$

@JAIL

$
epair1b: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether ff:ff:ff:ff:ff:ff
        inet 192.168.0.101 netmask 0xffffff00 broadcast 255.255.255.255 
        inet6 fe80:ff:ff:ff:fe00:fff%epair1b prefixlen 64 duplicated scopeid 0x2 
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
$

Addenda:

Step 11.0: (06.05.2014) Jail – netstat support
-> this widens the jail's access to memory management (use with caution) <-

$
# netstat -nrfinet
netstat: kvm not available: /dev/mem: No such file or directory
Routing tables
rt_tables: symbol not in namelist
$

Create a devfs rule:

$
vi /usr/local/etc/ezjail/test01

export jail_test01_devfs_ruleset="20"
$

$
vi /etc/devfs.rules

### Jail - VIMAGE - // ###
[devfsrules_jail_vimage=20]
add include $devfsrules_hide_all
add include $devfsrules_unhide_basic
add include $devfsrules_unhide_login
add path mem unhide
add path kmem unhide
### // Jail - VIMAGE - ###
$

$
service devfs restart
$

Step 11.1: (06.05.2014) Jail – tcpdump support

$
tcpdump: no suitable device found
$

Add a devfs rule:

$
### Jail - VIMAGE // ###
[devfsrules_jail_vimage=20]
add path 'bpf*' unhide
### // Jail - VIMAGE ###
$

Step 11.2: (07.05.2014) Jail – tun device for OpenVPN

Add a devfs rule:

$
### Jail - VIMAGE // ###
[devfsrules_jail_vimage=20]
add path 'tun*' unhide
### // Jail - VIMAGE ###
$
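The devfs rules from steps 11.0 to 11.2 all belong to the same ruleset 20, so in /etc/devfs.rules they end up under a single `[devfsrules_jail_vimage=20]` header. The sketch below prints such a combined ruleset. `devfs_ruleset` is a hypothetical helper, and quoting every path pattern (including plain names such as `mem`) is an assumption made for uniformity.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print one devfs ruleset that unhides the
# given device patterns on top of the basic jail includes.
# Arguments: ruleset name, ruleset number, then one or more path patterns.
devfs_ruleset() {
    name=$1 num=$2
    shift 2
    echo "[${name}=${num}]"
    # Single quotes keep the $devfsrules_* references literal.
    echo 'add include $devfsrules_hide_all'
    echo 'add include $devfsrules_unhide_basic'
    echo 'add include $devfsrules_unhide_login'
    for pattern in "$@"; do
        echo "add path '${pattern}' unhide"
    done
}
```

For the devices discussed so far, the call would be `devfs_ruleset devfsrules_jail_vimage 20 mem kmem 'bpf*' 'tun*'`. The output can be compared against the existing /etc/devfs.rules before running `service devfs restart`.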

Have the tun interface created:

$
vi /usr/local/etc/ezjail/test01

### OpenVPN // ###
export jail_test01_exec_prestart2="ifconfig tun0 create up"
export jail_test01_exec_poststart3="ifconfig tun0 vnet test01"
export jail_test01_exec_poststop1="ifconfig tun0 destroy"
### // OpenVPN ###
$

Step 11.3: (09.05.2014) Jail – IPv6

$
vi /usr/local/etc/ezjail/test01

export jail_test01_exec_poststart4="jexec test01 /sbin/ifconfig epair1b inet6 ffff:ffff:ffff:ffff::ffff prefixlen 64"
export jail_test01_exec_poststart5="jexec test01 /sbin/route add -inet6 default fe80::ffff:ffff:ffff:1dac%epair1b"
$

Step 11.4: (09.05.2014) Jail – bridge for a tap interface (e.g. for bridging with VirtualBox)

$
vi /etc/sysctl.conf

### VIMAGE // ###
net.link.tap.user_open=1
### // VIMAGE ###

vi /etc/devfs.rules

add path 'tap*' mode 0660 group operator

vi /etc/rc.conf

cloned_interfaces="bridge0 lagg0 tap0"
ifconfig_tap0="up"
ifconfig_vswitch0="addm lagg0 addm tap0"
$

see the devfs_system_ruleset rule

Switch the network interface of the virtual machine on the host:

$
VBoxManage modifyvm yourmachine --bridgeadapter1 tap0
$

That’s FreeBSD