Message ID | 20210304025939.9164-1-dbuono@linux.vnet.ibm.com |
---|---|
State | New |
Headers | show |
Series | meson: Stop if cfi is enabled with system slirp | expand |
On Wed, Mar 03, 2021 at 09:59:38PM -0500, Daniele Buono wrote: > For CFI, we need to compile slirp as a static library together with qemu. > This is because we register slirp functions as callbacks for QEMU Timers. > When using a system-wide shared libslirp, the type information for the > callback is missing and the timer call produces a false positive with CFI. Is there work being done, or at least an active plan, for fixing this ? Distros generally won't want to static link slirp to QEMU when there is a shared slirp available. It increases the security burden to maintain slirp twice, especially as slirp has a history of CVEs. IOW, the inability to use shared slirp may well prevent CFI from being used in distros. > > With this patch, meson will stop if CFI is enabled with system-wide slirp > > Signed-off-by: Daniele Buono <dbuono@linux.vnet.ibm.com> > --- > meson.build | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/meson.build b/meson.build > index f3db83e974..e1ec5020ac 100644 > --- a/meson.build > +++ b/meson.build > @@ -1569,6 +1569,18 @@ if have_system > endif > endif > > +# For CFI, we need to compile slirp as a static library together with qemu. > +# This is because we register slirp functions as callbacks for QEMU Timers. > +# When using a system-wide shared libslirp, the type information for the > +# callback is missing and the timer call produces a false positive with CFI. > +# > +# Now that slirp_opt has been defined, check if the selected slirp is compatible > +# with control-flow integrity. > +if get_option('cfi') and slirp_opt == 'system' > + error('Control-Flow Integrity is not compatible with system-wide slirp.' \ > + + ' Please configure with --enable-slirp=git') > +endif Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel
On 04/03/21 11:37, Daniel P. Berrangé wrote: > On Wed, Mar 03, 2021 at 09:59:38PM -0500, Daniele Buono wrote: >> For CFI, we need to compile slirp as a static library together with qemu. >> This is because we register slirp functions as callbacks for QEMU Timers. >> When using a system-wide shared libslirp, the type information for the >> callback is missing and the timer call produces a false positive with CFI. > > Is there work being done, or at least an active plan, for fixing this ? Daniele, would this work (uncompiled even)? diff --git a/net/slirp.c b/net/slirp.c index be914c0be0..82e05d2c01 100644 --- a/net/slirp.c +++ b/net/slirp.c @@ -174,23 +174,42 @@ static int64_t net_slirp_clock_get_ns(void *opaque) return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL); } +typedef struct SlirpTimer { + QEMUTimer t; + SlirpTimerCb cb; + void *cb_opaque; +} SlirpTimer; + +static void slirp_timer_cb(void *opaque) +{ + SlirpTimer *st = opaque; + st->cb(st->cb_opaque); +} + static void *net_slirp_timer_new(SlirpTimerCb cb, void *cb_opaque, void *opaque) { - return timer_new_full(NULL, QEMU_CLOCK_VIRTUAL, - SCALE_MS, QEMU_TIMER_ATTR_EXTERNAL, - cb, cb_opaque); + SlirpTimer *st = g_new(SlirpTimer, 1); + st->cb = cb; + st->cb_opaque = cb_opaque; + timer_init_full(&st->t, NULL, QEMU_CLOCK_VIRTUAL, + SCALE_MS, QEMU_TIMER_ATTR_EXTERNAL, + slirp_timer_cb, st); + return st; } static void net_slirp_timer_free(void *timer, void *opaque) { - timer_free(timer); + SlirpTimer *st = timer; + timer_del(&st->t); + g_free(st); } static void net_slirp_timer_mod(void *timer, int64_t expire_timer, void *opaque) { - timer_mod(timer, expire_timer); + SlirpTimer *st = timer; + timer_mod(&st->t, expire_timer); } static void net_slirp_register_poll_fd(int fd, void *opaque)
On 3/4/2021 5:56 AM, Paolo Bonzini wrote: > On 04/03/21 11:37, Daniel P. Berrangé wrote: >> On Wed, Mar 03, 2021 at 09:59:38PM -0500, Daniele Buono wrote: >>> For CFI, we need to compile slirp as a static library together with >>> qemu. >>> This is because we register slirp functions as callbacks for QEMU >>> Timers. >>> When using a system-wide shared libslirp, the type information for the >>> callback is missing and the timer call produces a false positive with >>> CFI. >> >> Is there work being done, or at least an active plan, for fixing this ? > > Daniele, would this work (uncompiled even)? Basically, yes it would work, but I'd be worried of the effectiveness of CFI. Timers in QEMU are one of the easiest ways to change control flow when you get arbitrary write permissions, and were used in at least a couple of VM escape demos last year. CFI as before would stop that attack vector. With this patch, we would leave the attack vector essentially open. More comments (and a couple of suggestions to make it harder) later. > > diff --git a/net/slirp.c b/net/slirp.c > index be914c0be0..82e05d2c01 100644 > --- a/net/slirp.c > +++ b/net/slirp.c > @@ -174,23 +174,42 @@ static int64_t net_slirp_clock_get_ns(void *opaque) > return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL); > } > > +typedef struct SlirpTimer { > + QEMUTimer t; > + SlirpTimerCb cb; > + void *cb_opaque; > +} SlirpTimer; > + > +static void slirp_timer_cb(void *opaque) > +{ > + SlirpTimer *st = opaque; > + st->cb(st->cb_opaque); > +} This call is still violating CFI (st->cb is a pointer in libslirp), but now that we have a specific callback for slirp, we can easily add a "QEMU_DISABLE_CFI" decorator to disable CFI only on `slirp_timer_cb`. The problem is that an attack on the timer is as easy now as it was without CFI. You just have to change the timer entry to call slirp_timer_cb, and the opaque pointer to a SlirpTimer struct you created, anywhere in memory, with a pointer to your own function and your own parameters. The call to slirp_timer_cb is valid for CFI, while the call to st->cb is not checked anymore. I believe we can "fix" this by manually (i.e. in the code) making sure that st->cb is a valid callback. I think there's a limited number of (possibly only one right now) callbacks that slirp will use on a timer. We could create a R/O array that contains the pointers to the allowed functions, and here check that st->cb is a pointer to one of those. A more generic alternative could be to try to use dladdr, to resolve the pointer to a symbol, and make sure that st->cb points to a symbol in libslirp. Not sure it will work (we are not opening libslirp with dlopen), and definitely heavy from a performance point of view, but would be probably a solution generic enough for all possible future cases. I can try to provide RFC patches for both cases, if you guys are interested, to see how the code would look like. > + > static void *net_slirp_timer_new(SlirpTimerCb cb, > void *cb_opaque, void *opaque) > { > - return timer_new_full(NULL, QEMU_CLOCK_VIRTUAL, > - SCALE_MS, QEMU_TIMER_ATTR_EXTERNAL, > - cb, cb_opaque); > + SlirpTimer *st = g_new(SlirpTimer, 1); > + st->cb = cb; > + st->cb_opaque = cb_opaque; > + timer_init_full(&st->t, NULL, QEMU_CLOCK_VIRTUAL, > + SCALE_MS, QEMU_TIMER_ATTR_EXTERNAL, > + slirp_timer_cb, st); > + return st; > } > > static void net_slirp_timer_free(void *timer, void *opaque) > { > - timer_free(timer); > + SlirpTimer *st = timer; > + timer_del(&st->t); > + g_free(st); > } > > static void net_slirp_timer_mod(void *timer, int64_t expire_timer, > void *opaque) > { > - timer_mod(timer, expire_timer); > + SlirpTimer *st = timer; > + timer_mod(&st->t, expire_timer); > } > > static void net_slirp_register_poll_fd(int fd, void *opaque) >
On 3/4/2021 5:37 AM, Daniel P. Berrangé wrote: > Is there work being done, or at least an active plan, for fixing this ? > > Distros generally won't want to static link slirp to QEMU when there is > a shared slirp available. It increases the security burden to maintain > slirp twice, especially as slirp has a history of CVEs. > > IOW, the inability to use shared slirp may well prevent CFI from being > used in distros. Daniel, Adoption is a very good point. We don't want to have multiple versions of the same library hanging around the O.S., unless strictly necessary. The problem (if I wear my security hat) is that, as you pointed out, slirp is known to have a history of CVEs, and it also rely heavily on callbacks and function pointers. So it would be one of the best candidates for CFI support. A (long-term) solution could be to compile libslirp as a shared library, WITH Control-Flow Integrity. Clang does have an experimental support for Cross-DSO CFI. However, it is not viable at the moment because: 1. It is still considered Experimental 2. It is not compatible with pointer type generalization (which we need because of Glib and other uses in QEMU). Cross-DSO CFI also have some performance implications but I think that would be a very small price to pay, and only in corner-case conditions. I don't want to bore anyone too much with the details of the implementation... Yet. I'd be happy to explain the Cross-DSO mechanism implemented by Clang if it is considered interesting here. The details can also be found here: https://clang.llvm.org/docs/ControlFlowIntegrity.html#shared-library-support And https://clang.llvm.org/docs/ControlFlowIntegrityDesign.html (Section "Shared library support") I think this would be the best long-term solution to improve security because it would allow to use CFI virtually on every library we consider security-sensitive, but not on the others. But it would require some help and work from/to the Clang community. In the short term, we should work out something similar to Paolo's approach. I'll add a few comments to his email.
On 05/03/21 17:52, Daniele Buono wrote: >> diff --git a/net/slirp.c b/net/slirp.c >> index be914c0be0..82e05d2c01 100644 >> --- a/net/slirp.c >> +++ b/net/slirp.c >> @@ -174,23 +174,42 @@ static int64_t net_slirp_clock_get_ns(void *opaque) >> return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL); >> } >> >> +typedef struct SlirpTimer { >> + QEMUTimer t; >> + SlirpTimerCb cb; >> + void *cb_opaque; >> +} SlirpTimer; >> + >> +static void slirp_timer_cb(void *opaque) >> +{ >> + SlirpTimer *st = opaque; >> + st->cb(st->cb_opaque); >> +} > > This call is still violating CFI (st->cb is a pointer in libslirp), but > now that we have a specific callback for slirp, we can easily add a > "QEMU_DISABLE_CFI" decorator to disable CFI only on `slirp_timer_cb`. Ok, so let's go with your patch for now. Independently we could change libslirp to: - add a separate callback (timer_new_v2?) in SlirpCb. If it is set, we use it and pass an enum instead of the SlirpTimerCb cb. - add a function slirp_timer_expired(enum SlirpTimerId timer_id, void *cb_opaque) that does the indirection but without passing around an arbitrary function pointer. In 6.1 we will update the internal libslirp to a version that supports the new API, through a patch very similar to mine above. By requiring that new version as the minimum supported one, enabling CFI will be possible even with system slirp. You can open a slirp merge request at https://gitlab.freedesktop.org/slirp/libslirp. > The problem is that an attack on the timer is as easy now as it was > without CFI. You just have to change the timer entry to call > slirp_timer_cb, and the opaque pointer to a SlirpTimer struct you > created, anywhere in memory, with a pointer to your own function and > your own parameters. The call to slirp_timer_cb is valid for CFI, while > the call to st->cb is not checked anymore. I understand better now the situation, thanks for taking the time to explain it. Paolo
On Fri, Mar 05, 2021 at 11:53:07AM -0500, Daniele Buono wrote: > On 3/4/2021 5:37 AM, Daniel P. Berrangé wrote: > > Is there work being done, or at least an active plan, for fixing this ? > > > > Distros generally won't want to static link slirp to QEMU when there is > > a shared slirp available. It increases the security burden to maintain > > slirp twice, especially as slirp has a history of CVEs. > > > > IOW, the inability to use shared slirp may well prevent CFI from being > > used in distros. > > Daniel, > Adoption is a very good point. We don't want to have multiple versions > of the same library hanging around the O.S., unless strictly necessary. > > The problem (if I wear my security hat) is that, as you pointed out, > slirp is known to have a history of CVEs, and it also rely heavily on > callbacks and function pointers. So it would be one of the best > candidates for CFI support. > > A (long-term) solution could be to compile libslirp as a shared library, > WITH Control-Flow Integrity. Clang does have an experimental support for > Cross-DSO CFI. However, it is not viable at the moment because: > 1. It is still considered Experimental > 2. It is not compatible with pointer type generalization (which we need > because of Glib and other uses in QEMU). > Cross-DSO CFI also have some performance implications but I think that > would be a very small price to pay, and only in corner-case conditions. My concern is that libslirp is just showing us one known example of the problem. QEMU links to many more external libraries, which might exhibit similar issues. If we need to rebuild all the dependancies with CFI too, to be confident that the combined work will operate correctly, then this is quite a significant implication. Overall I think this is going to be a problem for the changes of distros adopting the use of CFI, especially if they're not using CLang as their toolchain. Regards, Daniel
On 08/03/21 12:19, Daniel P. Berrangé wrote: > My concern is that libslirp is just showing us one known example of > the problem. QEMU links to many more external libraries, which might > exhibit similar issues. We know exactly the issue: QEMU uses a pointer to a library function as a function pointer that is called *from QEMU* and not from another library. More issues like this (or other CFI issues such as the GLib casted event handlers) could arise if some but not all the dependencies are linked with CFI. But that should be handled at the distro level; if none of the dependencies use CFI, CFI incompatibility issues are fairly limited. Paolo > If we need to rebuild all the dependancies > with CFI too, to be confident that the combined work will operate > correctly, then this is quite a significant implication. Overall I > think this is going to be a problem for the changes of distros adopting > the use of CFI, especially if they're not using CLang as their toolchain.
On 3/8/2021 6:19 AM, Daniel P. Berrangé wrote: > My concern is that libslirp is just showing us one known example of > the problem. QEMU links to many more external libraries, which might > exhibit similar issues. If we need to rebuild all the dependancies > with CFI too, to be confident that the combined work will operate > correctly, then this is quite a significant implication. Overall I > think this is going to be a problem for the changes of distros adopting > the use of CFI, especially if they're not using CLang as their toolchain. In my opinion, there's no need to rebuild everything with CFI. There will be libraries that will benefit more from CFI, such as libslirp IMHO. But that still doesn't even mean that we need a CFI-enabled version to operate correctly. From a functional point of view, there are plenty of ways to have a CFI- enabled binary work with shared libraries that do not support CFI (or cross-dso CFI). From a security point of view it will be a trade-off. So I think we should study it on a per-library case to find out the best way forward. I believe in most cases, an approach like the one discussed with Paolo will be more than enough to get a good security level in QEMU, especially if the feature provided by the library is not used at runtime. > > Regards, > Daniel
On 3/5/2021 12:18 PM, Paolo Bonzini wrote: >> > > Ok, so let's go with your patch for now. Independently we could change > libslirp to: > > - add a separate callback (timer_new_v2?) in SlirpCb. If it is set, we > use it and pass an enum instead of the SlirpTimerCb cb. > > - add a function slirp_timer_expired(enum SlirpTimerId timer_id, void > *cb_opaque) that does the indirection but without passing around an > arbitrary function pointer. > > In 6.1 we will update the internal libslirp to a version that supports > the new API, through a patch very similar to mine above. By requiring > that new version as the minimum supported one, enabling CFI will be > possible even with system slirp. > > You can open a slirp merge request at > https://gitlab.freedesktop.org/slirp/libslirp. Hi Paolo, sounds like a good plan! I'll start working on this. Regards, Daniele
diff --git a/meson.build b/meson.build index f3db83e974..e1ec5020ac 100644 --- a/meson.build +++ b/meson.build @@ -1569,6 +1569,18 @@ if have_system endif endif +# For CFI, we need to compile slirp as a static library together with qemu. +# This is because we register slirp functions as callbacks for QEMU Timers. +# When using a system-wide shared libslirp, the type information for the +# callback is missing and the timer call produces a false positive with CFI. +# +# Now that slirp_opt has been defined, check if the selected slirp is compatible +# with control-flow integrity. +if get_option('cfi') and slirp_opt == 'system' + error('Control-Flow Integrity is not compatible with system-wide slirp.' \ + + ' Please configure with --enable-slirp=git') +endif + fdt = not_found fdt_opt = get_option('fdt') if have_system
For CFI, we need to compile slirp as a static library together with qemu. This is because we register slirp functions as callbacks for QEMU Timers. When using a system-wide shared libslirp, the type information for the callback is missing and the timer call produces a false positive with CFI. With this patch, meson will stop if CFI is enabled with system-wide slirp Signed-off-by: Daniele Buono <dbuono@linux.vnet.ibm.com> --- meson.build | 12 ++++++++++++ 1 file changed, 12 insertions(+)