Patchwork [2/2] VirtIO RNG

Submitter Ian Molton
Date March 30, 2010, 2:13 p.m.
Message ID <4BB206EF.3030100@collabora.co.uk>
Permalink /patch/48966/
State New

Comments

Ian Molton - March 30, 2010, 2:13 p.m.
From 5f484301d73fa53009bbcd430f8ae85868b67772 Mon Sep 17 00:00:00 2001
From: Ian Molton <ian.molton@collabora.co.uk>
Date: Tue, 17 Nov 2009 14:34:12 +0000
Subject: [PATCH 2/4] virtio: Add virtio-rng driver

	This patch adds support for virtio-rng. Data is read from a chardev and
can be either raw entropy or received via the EGD protocol.

Signed-off-by: Ian Molton <ian.molton@collabora.co.uk>
---
 Makefile.target |    2 +-
 hw/pci.h        |    1 +
 hw/virtio-pci.c |   29 ++++++++
 hw/virtio-rng.c |  195 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/virtio-rng.h |   19 ++++++
 hw/virtio.h     |    2 +
 rng.h           |   14 ++++
 7 files changed, 261 insertions(+), 1 deletions(-)
 create mode 100644 hw/virtio-rng.c
 create mode 100644 hw/virtio-rng.h
 create mode 100644 rng.h
Paul Brook - April 1, 2010, 12:17 p.m.
>        This patch adds support for virtio-rng. Data is read from a chardev
> and can be either raw entropy or received via the EGD protocol.

I still don't get why you need this at all. It seems like virtio-serial would 
already provide everything you need.

> +        qemu_gettimeofday(&now);

Using qemu_gettimeofday is almost certainly wrong, and you want to be using 
virtual time.  Plus I'm not convinced this is the right place to enforce rate 
limiting.

Paul
Jamie Lokier - April 1, 2010, 12:30 p.m.
Paul Brook wrote:
> >        This patch adds support for virtio-rng. Data is read from a chardev
> > and can be either raw entropy or received via the EGD protocol.
> 
> I still don't get why you need this at all. It seems like
> virtio-serial would already provide everything you need.

I guess when virtio-rng was first written, virtio-serial wasn't
flexible enough because it didn't support multiple devices - and maybe
virtio-rng is still needed to enforce the EGD protocol when that is
being used.

> > +        qemu_gettimeofday(&now);
> 
> Using qemu_gettimeofday is almost certainly wrong, and you want to
> be using virtual time.  Plus I'm not convinced this is the right
> place to enforce rate limiting.

If it's for rate limiting how fast the guest can take entropy from the
host, to ensure the host's entropy cannot be exhausted by a single
greedy guest, then perhaps qemu_gettimeofday() is right here.  I doubt
if virtual time is right, at least not by itself.

I would hope that the host can rate limit itself without needing apps
to govern themselves, though.

-- Jamie
Paul Brook - April 1, 2010, 2:03 p.m.
>> >        This patch adds support for virtio-rng. Data is read from a
>> > chardev and can be either raw entropy or received via the EGD protocol.
>>
>> I still don't get why you need this at all. It seems like
>> virtio-serial would already provide everything you need.
>
>I guess when virtio-rng was first written, virtio-serial wasn't
>flexible enough because it didn't support multiple devices 

That argument no longer holds.

> and maybe virtio-rng is still needed to enforce the EGD protocol when that
> is being used.

Maybe, though the benefit of having this knowledge in qemu seems somewhat 
unclear. If we do want it then shouldn't it be implemented as a char device 
backend, rather than part of a specific serial port implementation?

> > > +        qemu_gettimeofday(&now);
> >
> > Using qemu_gettimeofday is almost certainly wrong, and you want to
> > be using virtual time.  Plus I'm not convinced this is the right
> > place to enforce rate limiting.
> 
> If it's for rate limiting how fast the guest can take entropy from the
> host, to ensure the host's entropy cannot be exhausted by a single
> greedy guest, then perhaps qemu_gettimeofday() is right here.  I doubt
> if virtual time is right, at least not by itself.

gettimeofday can and does jump arbitrarily. Comparing returned values is 
almost always wrong.

Paul
Ian Molton - April 2, 2010, 10:07 a.m.
Paul Brook wrote:
>>        This patch adds support for virtio-rng. Data is read from a chardev
>> and can be either raw entropy or received via the EGD protocol.
> 
> I still don't get why you need this at all. It seems like virtio-serial would 
> already provide everything you need.

Because we need to support the virtio-rng driver as it presently exists
in the linux kernel?

>> +        qemu_gettimeofday(&now);
> 
> Using qemu_gettimeofday is almost certainly wrong, and you want to be using 
> virtual time.

I don't think so. What makes you say that?

> Plus I'm not convinced this is the right place to enforce rate 
> limiting.

Given that some entropy sources enforce it and some do not, where else
would you suggest it be enforced?

This is now bikeshedding severely. We already had this discussion. I've
already modified the patch to comply with *every single one* of the
requests made of me. Frankly I'm getting fed up, and liable to stop
bothering to submit it at all.

If the patch wasn't going to be accepted in any form, it would have been
a courtesy to have just said it in the first place, rather than string
me along saying "well, it'll be fine if you just do..."

I think it speaks volumes that the kernel side of this patch (which
actually involved a complete rewrite of a chunk of the hwrandom code)
got reviewed, tweaked, and accepted inside a couple of weeks. I guess at
least some of my time spent on this wasn't wasted after all...

-Ian
Ian Molton - April 2, 2010, 10:13 a.m.
Paul Brook wrote:
>>>>        This patch adds support for virtio-rng. Data is read from a
>>>> chardev and can be either raw entropy or received via the EGD protocol.
>>> I still don't get why you need this at all. It seems like
>>> virtio-serial would already provide everything you need.
>> I guess when virtio-rng was first written, virtio-serial wasn't
>> flexible enough because it didn't support multiple devices 
> 
> That argument no longer holds.

So now everything that looks like a stream of bytes has to use the
virtio-serial code...

Why? It's not like it'll make the rng device any simpler, smaller,
faster, or reduce its dependencies. Virtio is simple enough to begin with!

>> and maybe virtio-rng is still needed to enforce the EGD protocol when that
>> is being used.
> 
> Maybe, though the benefit of having this knowledge in qemu seems somewhat 
> unclear. If we do want it then shouldn't it be implemented as a char device 
> backend, rather than part of a specific serial port implementation?

Be my guest...

> gettimeofday can and does jump arbitrarily. Comparing returned values is 
> almost always wrong.

True, however the worst case is still a very temporary over-allotment of
entropy, which really isn't a problem.

-Ian
Ian Molton - April 2, 2010, 10:15 a.m.
Jamie Lokier wrote:

> I would hope that the host can rate limit itself without needing apps
> to govern themselves, though.

That depends on the source of the entropy - some sources can, some are
fed to daemons that can, but not all systems have such daemons.

:)

-Ian
Paul Brook - April 3, 2010, 3:06 p.m.
> Paul Brook wrote:
> >>>>        This patch adds support for virtio-rng. Data is read from a
> >>>> chardev and can be either raw entropy or received via the EGD
> >>>> protocol.
> >>>
> >>> I still don't get why you need this at all. It seems like
> >>> virtio-serial would already provide everything you need.
> >>
> >> I guess when virtio-rng was first written, virtio-serial wasn't
> >> flexible enough because it didn't support multiple devices
> >
> > That argument no longer holds.
> 
> So now everything that looks like a stream of bytes has to use the
> virtio-serial code...

IMO, yes. That's the whole point of virtio-serial.

You can handle it however you like in your favourite guest OS, but I 
object to qemu having multiple virtio devices that look and smell exactly like 
a serial port.

> Why? It's not like it'll make the rng device any simpler, smaller,
> faster, or reduce its dependencies. Virtio is simple enough to begin with!

I disagree. Past experience shows that it's more than possible to completely 
screw up something as simple as a serial port. If everything that looks like a 
serial port provides its own implementation then the chances are they'll all 
be missing something, and broken in different ways.

Take rate limiting as an example: Your implementation has at least one 
outright bug (using gettimeofday), and various questionable characteristics 
(is a 1-second token bucket appropriate?).
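As a sketch of the kind of policy being debated (names, fields and defaults below are invented for illustration; this is not the patch's code), a token bucket refilled from a monotonic timestamp might look like:

```c
#include <stdint.h>

/* Minimal token-bucket sketch: "rate" tokens are added per second,
 * capped at "burst"; each byte of entropy served consumes one token.
 * now_ns is a monotonic timestamp in nanoseconds. */
typedef struct {
    int64_t tokens;   /* current token count */
    int64_t rate;     /* tokens added per second */
    int64_t burst;    /* bucket capacity */
    int64_t last_ns;  /* time of last refill */
} TokenBucket;

static void bucket_refill(TokenBucket *tb, int64_t now_ns)
{
    int64_t added = (now_ns - tb->last_ns) * tb->rate / 1000000000LL;
    if (added > 0) {
        tb->tokens += added;
        if (tb->tokens > tb->burst) {
            tb->tokens = tb->burst;
        }
        /* advance last_ns only by the time actually converted to tokens */
        tb->last_ns += added * 1000000000LL / tb->rate;
    }
}

/* Returns how many of "want" bytes may be served right now. */
static int64_t bucket_take(TokenBucket *tb, int64_t now_ns, int64_t want)
{
    bucket_refill(tb, now_ns);
    int64_t n = want < tb->tokens ? want : tb->tokens;
    tb->tokens -= n;
    return n;
}
```

Whether a 1-second window (or any particular rate/burst pairing) is the right default is exactly the policy question raised above.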

The EGD protocol bits are another example. You aren't exposing any of the 
interesting bits of this protocol to the guest, so it seems entirely 
reasonable to connect this to a serial port. That allows it to be used by 
guests that don't have a virtio-rng driver.

Paul
Ian Molton - April 13, 2010, 2:41 p.m.
Paul Brook wrote:
>> Paul Brook wrote:

>> So now everything that looks like a stream of bytes has to use the
>> virtio-serial code...
> 
> IMO, yes. That's the whole point of virtio-serial.
> 
> You can handle it however you like in your favourite guest OS, but I 
> object to qemu having multiple virtio devices that look and smell exactly like 
> a serial port.

I don't think it looks and smells exactly like a serial port. The code is
simpler to begin with, and it doesn't support any out-of-band signalling.

>> Why? Its not like it'll make the rng device any simpler, smaller,
>> faster, or reduce its dependencies. Virtio is simple enough to begin with!
> 
> I disagree. Past experience shows that it's more than possible to completely 
> screw up something as simple as a serial port.

Straw man. Serial ports might be fairly simple things, but my device is
simpler still than most of them, and you're talking about emulating a
serial port, where you have control lines, line disciplines, etc. to
emulate, all of which need to be right. None of that is a concern in my
driver.

> If everything that looks like a 
> serial port provides its own implementation then the chances are they'll all 
> be missing something, and broken in different ways.

Better never write any code then; it's guaranteed to have a bug in it.

I *know* what you're getting at - I removed hundreds of lines of common
code copied and randomly, brokenly, modified from the various ASoC
kernel drivers a few months back.

But this bit of code doesn't really have enough in common with anything
else to make it worth handling your way. Alternatively, prove me wrong!

> Take rate limiting as an example: Your implementation has at least one 
> outright bug (using gettimeofday),

Arguably a bug. I'd disagree that it causes any problems in practice.
(qemu_gettimeofday, btw - which I was specifically requested to use
during an earlier review of this patch...)

> and various questionable characteristics (is a 1-second token bucket appropriate).

So what? If people don't like the default policy, or it's too limited,
they can feel free to contribute patches. It's hardly like this is core
code where minute changes will require months of review to ratify!

At least I'm doing something about the
total-lack-of-any-support-whatsoever situation.

> The EGD protocol bits are another example. You aren't exposing any of the 
> interesting bits of this protocol to the guest, so it seems entirely 
> reasonable to connect this to a serial port.

Except that then we need to have a little daemon to feed rate limited
EGD protocol data to said serial port.

So now you've got the situation of the USB-serial entropy key, feeding
the (potentially) proprietary egd-compatible-but-not-rate-limited daemon
that feeds a rate limiting daemon into a serial port on qemu that feeds
the guest OS.

So this is running in a server farm with 64 or 128 guests sat on it,
each one with its own silly rate limiting daemon / serial port mux from
the egd daemon the USB stick feeds. What happens when you apt-get
upgrade and the egd daemon gets replaced? You really want to notify and
restart all 128 noddy mux daemons and reconnect qemu to them? Why don't
we just kill the guests and reboot them all too?

My patchset handled this case without any problems at all (the reader
would simply (and without blocking) reconnect to the server).

The point is that you can take 'generic support' too far. Sure you CAN
model this as a bunch of serial ports, and feed them from rate limiting
daemons, which in turn read from some egd-speaking daemon or hardware,
but the EGD protocol is *SO* simple. Why on earth go to these lengths when
it can be implemented and embedded in a handful of lines of code?
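For reference, the EGD read requests really are tiny: as commonly implemented, a request is a single opcode byte followed by a one-byte count (the helper and names below are illustrative, not from the patch).

```c
#include <stddef.h>
#include <stdint.h>

/* EGD read opcodes as commonly implemented by egd-compatible daemons. */
enum {
    EGD_READ_NONBLOCK = 0x01, /* reply: 1-byte length, then up to count bytes */
    EGD_READ_BLOCK    = 0x02, /* reply: exactly count bytes, may wait */
};

/* Fill "req" (at least 2 bytes) with a request for "count" entropy
 * bytes; returns the request length to write to the daemon. */
static size_t egd_build_read(uint8_t *req, uint8_t opcode, uint8_t count)
{
    req[0] = opcode; /* which kind of read */
    req[1] = count;  /* number of entropy bytes wanted (1..255) */
    return 2;
}
```

Decoding the reply is equally small, which is the substance of the "handful of lines" claim.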

The chardev reconnect stuff was in the generic qemu chardev layer and
thus reusable by other parts of qemu that need to maintain a persistent
state even when connected to other processes via sockets, pipes, fifos etc.
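A reconnect policy of the kind described might be as simple as a capped exponential backoff (a sketch only; the delays are assumptions, not what the patch implements):

```c
#include <stdint.h>

/* Backoff for a chardev that reconnects on its own: double the delay
 * each failed attempt, capped so a restarted daemon is picked up
 * within a few seconds without busy-looping while it is down. */
static int64_t reconnect_delay_ms(int attempt)
{
    int64_t delay = 100; /* first retry after 100 ms (assumed default) */
    while (attempt-- > 0 && delay < 5000) {
        delay *= 2;
    }
    return delay < 5000 ? delay : 5000;
}
```

The point of doing this in the chardev layer is that the guest-facing device never sees the outage; reads simply stall until the backend is reachable again.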

> That allows it to be used by guests that don't have a virtio-rng driver.

That's already supported by use of TCP tunnels, but that's 1) not
efficient, 2) not very secure - few sources of entropy provide crypto
to guarantee that the data source hasn't been frobbed - and 3) much more
vulnerable to attack from processes running on the guest (TCP is much
more exposed than the rng-core internals).

So, rather than bike-shedding, how about making some kind of decision?

Either:

1) Dictate how the code is going to be written (in which case, since you
know it so well, write it yourself)

or:

2) Accept that not everyone likes your (idea for an) implementation, that
prior interfaces exist on the guest side, and that a perfectly good and
non-intrusive way of doing it on qemu has been written for you
(including a couple of other nice features / fixes along the way, like
adding the missing SIZE property, socket reconnect support, etc.)

Bear in mind that it's rather unfair to (as has happened in this case) have a
patch reviewed, modified based on criticism, and then rejected after
effort has been wasted working on it.

-Ian
Paul Brook - April 13, 2010, 3:01 p.m.
> Bear in mind that its rather unfair to (as has been in this case) have a
> patch reviewed, modified based on criticism, and then rejected after
> effort has been wasted working on it.

I'm pretty sure I raised all these issues the first time this code was 
submitted.

Paul
Paul Brook - April 13, 2010, 3:32 p.m.
> So, rather than bike-shedding, how about making some kind of decision?

Ok, let me make this simple.

Features such as rate limiting and EGD protocol translation should not be part 
of individual device emulation. They are part of the host interface, not the 
guest machine.  If you really want these in qemu then they should be 
implemented as part of the chardev layer.

It may be a bit more work to implement these properly. However I'm not going 
to accept a half-assed implementation just because someone wrote it.


A direct result of this is that virtio-rng degenerates to a crippled virtual 
serial port. We already have a virtual serial port implementation designed for 
exactly this kind of application.  IMHO virtio-rng is a complete waste of time 
and space.

Paul
Ian Molton - April 20, 2010, 3:15 p.m.
Paul Brook wrote:
>> So, rather than bike-shedding, how about making some kind of decision?
> 
> Ok, let me make this simple.

Great!

> Features such as rate limiting and EGD protocol translation should not be part 
> of individual device emulation. They are part of the host interface, not the 
> guest machine.  If you really want these in qemu then they should be 
> implemented as part of the chardev layer.

Well, that's a bit more descriptive than a NAK at last. Thank you.

So, before I bother committing work time to this, will a patch to
implement rate limiting in the chardev layer be accepted in principle?

Is there going to be a fuss over what *kind* of rate limiting? Am I
going to have to implement it both for incoming AND outgoing streams?

Personally, I'm not interested in implementing more than basic rate
limiting, MAYBE with a burst limit as well, and only on incoming streams.

Is that going to be acceptable? If not, then what?

> It may be a bit more work to implement these properly. However I'm not going 
> to accept a half-assed implementation just because someone wrote it.

You can't realistically expect someone to implement a feature across an
entire (sub)system every time a single use-case is found.

I actually like the idea of putting it in the chardev layer, so I don't
mind doing the rejiggery to implement that, as long as I'm not going to
be forced to implement solutions for every corner case in creation (I
only have a small test case and less than infinite time).

> A direct result of this is that virtio-rng degenerates to a crippled virtual 
> serial port.

And a very small, very simple driver with next to nothing to go wrong.
The kernel interface already exists, and has been debugged (it used to
have a major alignment-based bug) by me. It's in use by other
hypervisors. It exists, and people want support for it. And I want to
write the support.

Just because *you* don't think it's useful doesn't mean it's inherently
bad - and it's a device driver - if you don't like it, don't build it in,
just like I don't build the cirrus logic VGA driver in.

I could understand your POV if you were saying that (once the rate
limiting is moved to the chardev layer) the driver was actually broken,
but as it *isn't*, and it supports a pre-existing kernel virtio device, I
don't see why you have a problem with it.

In fact, if no other users of the kernel's virtio-rng driver existed,
then I'd (almost) agree with you. But they do, and with the rate
limiting moved into a chardev, the driver becomes simpler than the
serial drivers.

One last and quite important point - where should the EGD protocol
implementation go? Really it needs to work as kind of a line discipline,
but AFAICT that's not supported? It would be a mistake to put rate
limiting in the chardev layer and leave the EGD protocol implementation
in the driver.

Ideally, the layering would be

EGD -> rate limiting -> device driver

But I don't see quite how this will fit into the chardev layer cleanly.
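One way such a layering could fit a chardev-style interface (purely a sketch with invented names - not an actual QEMU API) is a chain of stages that all share one read shape, so EGD decoding can sit above rate limiting, which sits above the raw backend:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct Stage Stage;
struct Stage {
    /* Pull up to "len" bytes into "buf"; returns bytes produced. */
    size_t (*read)(Stage *s, uint8_t *buf, size_t len);
    Stage *lower;   /* next stage down, towards the raw chardev */
    size_t budget;  /* used here by the rate-limit stage */
};

/* Bottom stage: stands in for the raw chardev; emits dummy 0xAA bytes. */
static size_t source_read(Stage *s, uint8_t *buf, size_t len)
{
    (void)s;
    for (size_t i = 0; i < len; i++) {
        buf[i] = 0xAA;
    }
    return len;
}

/* Middle stage: caps how much the upper layers may pull. */
static size_t limit_read(Stage *s, uint8_t *buf, size_t len)
{
    if (len > s->budget) {
        len = s->budget;
    }
    size_t n = s->lower->read(s->lower, buf, len);
    s->budget -= n;
    return n;
}
```

An EGD-decoding stage would slot in the same way, reading framed replies from below and exposing raw entropy above - which is roughly the "line discipline" behaviour being asked about.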

TBH, for the sake of one very simple driver, and given that apparently
no other users in qemu seem to want rate-limiting, *I* think that you
are massively over-complicating matters right now. If more drivers need
rate limiting, perhaps, but that doesn't seem to be the case.

> We already have a virtual serial port implementation designed for 
> exactly this kind of application.

Except that it doesn't speak to the kernel's virtio-rng implementation.
And that interface is not going to change just because you don't like
it. (Unless you'd like to rewrite the kernel's hwrng core? Feel free! I'm
sure it'd be appreciated - but if not, then don't complain.)

> IMHO virtio-rng is a complete waste of time and space.

To you. Not everyone. Unfortunately, qemu is missing some infrastructure
(like the SIZE property and socket reconnects) right now that would make
it useful.

Could I ask that you review and apply the SIZE property patch and the
reconnect patch? I've modified both of those in line with earlier
criticism, and their acceptance should be orthogonal to the issues
surrounding virtio-rng. Both patches provide useful core functionality.
They were submitted as part of the same patchset, but I'll happily break
them out if required. I've seen no negative remarks about those patches
this time around, so it should be time to apply them.

-Ian
Jamie Lokier - April 20, 2010, 4:13 p.m.
Ian Molton wrote:
> One last and quite important point - where should the EGD protocol
> implementation go? really it needs to work as kind-of a line discipline,
> but AFAICT thats not supported? it would be a mistake to put rate
> limiting in the chardev layer and leave the EGD protocol implementation
> in the driver.

What do the other hypervisors supporting virtio-rng do?

Personally I'm failing to see why EGD support is needed in Qemu, as
none of the crypto services on my Linux machines seem to need it, so
why should Qemu be special? But I acknowledge there might be some
obscure reason.

> TBH, for the sake of one very simple driver, and given that apparently
> no other users in qemu seem to want rate-limiting, *I* think that you
> are massively over-complicating matters right now. If more drivers need
> rate limiting, perhaps, but that doesn't seem to be the case.

Rate limiting both networking and serial ports may be a handy little
option sometimes.  Iirc, there are some guests which get confused when
data transfers are gazillions of times faster than they expected, or
gazillions of times more bursty in the case of networking.

> > We already have a virtual serial port implementation designed for 
> > exactly this kind of application.
> 
> Except that it doesn't speak to the kernels virtio-rng implementation.
> And that interface is not going to change just because you don't like
> it. (Unless you'd like to rewrite the kernels hwrng core? feel free! I'm
> sure it'd be appreciated - but if not, then don't complain)

Would it be much work to change the guest to use virtio-serial
instead?  Would it fit the problem or does virtio-rng need more
metadata than just a bytestream?

-- Jamie
Ian Molton - April 20, 2010, 7:52 p.m.
Jamie Lokier wrote:

Hi :-)

> Ian Molton wrote:
>> One last and quite important point - where should the EGD protocol
>> implementation go? really it needs to work as kind-of a line discipline,
>> but AFAICT thats not supported? it would be a mistake to put rate
>> limiting in the chardev layer and leave the EGD protocol implementation
>> in the driver.
> 
> What do the other hypervisors supporting virtio-rng do?

Um. Support it? I'm not sure what you mean there.

> Personally I'm failing to see why EGD support is needed in Qemu, as
> none of the crypto services on my Linux machines seem to need it so
> why should Qemu be special, but I acknowledge there might be some
> obscure reason.

I know several people who use egd-based hardware on their hosts who want
virtio-rng support in their guests - it avoids having to route the data
via TCP or other long-winded paths that are trivially vulnerable to
attack both on the host and from the guest's userspace. The hwrng core is
inherently very simple.

>> TBH, for the sake of one very simple driver, and given that apparently
>> no other users in qemu seem to want rate-limiting, *I* think that you
>> are massively over-complicating matters right now. If more drivers need
>> rate limiting, perhaps, but that doesn't seem to be the case.
> 
> Rate limiting both networking and serial ports may be a handy little
> option sometimes.

Perhaps, but I'm not a big user of those systems and as such I'm not
really in a position to design a rate limiting algorithm that's going to
be really appropriate without spending a lot of time I don't have looking
at stuff I don't really care about.

I don't mind implementing trivial rate/burst limiting in the chardev
layer if that's what's wanted, but I have neither the time nor inclination
to write a thesis on the subject :-)

>  Iirc, there are some guests which get confused when
> data transfers are gazillions times faster than they expected, or
> gazillions times more bursty in the case of networking.

Realistically, that kind of limiting might be best off in the device
drivers those systems use, which have more intimate knowledge of how the
virtual hardware should behave. Without extensive testing I just couldn't
answer that.

>>> We already have a virtual serial port implementation designed for 
>>> exactly this kind of application.
>> Except that it doesn't speak to the kernels virtio-rng implementation.
>> And that interface is not going to change just because you don't like
>> it. (Unless you'd like to rewrite the kernels hwrng core? feel free! I'm
>> sure it'd be appreciated - but if not, then don't complain)
> 
> Would it be much work to change the guest to use virtio-serial
> instead?  Would it fit the problem or does virtio-rng need more
> metadata than just a bytestream?

If anything virtio-rng uses *less* info than the virtio-serial driver
(it has one stream, in one direction, only).

But the kernel already has a virtio-rng driver and it already works in a
particular way, which is already implemented in other hypervisors. I
didn't write the kernel driver; it is pre-existing. I suppose we could
ask the kernel devs if they'd like another virtio-based rng driver in
addition to the one already in use? But that seems perverse.

Why even bother with virtio at all if you're going to abstract
everything away behind a serial port? We could just modify all the guest
kernel virtio-block drivers to serialise all their metadata, then we
could use serial ports for that too!

Abstraction can be taken too far.

All IMHO, but I'm standing by what I said. If I can get a reasonable
consensus about rewriting the code so that we get all the features
needed for an enterprise level solution, namely

* Fault tolerance - socket reconnection support.
* EGD protocol support - because it's what the users of this code (not me
as such, I don't actually use it at all myself) actually want
* virtio-rng support - because anything else is stupid and would involve
getting a second virtio-serial-rng driver into the kernel.

Then great.

IMHO, the socket reconnect patch and the SIZE parameter patch fix
obvious flaws in qemu and should be applied now, unless someone can come
up with a good reason why they shouldn't.

We can carry on debating the actual virtio-rng driver if that's what's
wanted, but I have stated my position. So far the only argument against
is "well I wouldn't choose to do it that way", which is hardly concrete
reasoning.

Please, please, can we get out of the bikeshed?

-Ian
Blue Swirl - April 20, 2010, 8:11 p.m.
On 4/20/10, Ian Molton <ian.molton@collabora.co.uk> wrote:
> [...]
>
>  Please, please, can we get out of the bikeshed?

I would not call the discussions bikeshedding but architecture design
steering and project leadership.
Jamie Lokier - April 20, 2010, 8:56 p.m.
Ian Molton wrote:
> Jamie Lokier wrote:
> 
> Hi :-)
> 
> > Ian Molton wrote:
> >> One last and quite important point - where should the EGD protocol
> >> implementation go? really it needs to work as kind-of a line discipline,
> >> but AFAICT thats not supported? it would be a mistake to put rate
> >> limiting in the chardev layer and leave the EGD protocol implementation
> >> in the driver.
> > 
> > What do the other hypervisors supporting virtio-rng do?
> 
> Um. Support it? I'm not sure what you mean there.

Do the other hypervisors do anything special to support EGD, or is it
just treated as another kind of serial port connected to /dev/urandom
(or whatever) on the host?

> > Personally I'm failing to see why EGD support is needed in Qemu, as
> > none of the crypto services on my Linux machines seem to need it so
> > why should Qemu be special, but I acknowledge there might be some
> > obscure reason.
> 
> I know several people who use egd-based hardware on their hosts who want
> virtio-rng support in their guests

That's a fine reason.

> - it avoids having to route the data
> via TCP or other long-winded paths that are trivially vulnerable to
> attack both on the host and from the guest's userspace. The hwrng core is
> inherently very simple.

Wouldn't it be trivial to run egd on the guest, talking over the
recent virtio-serial driver to egd on the host?

> > Would it be much work to change the guest to use virtio-serial
> > instead?  Would it fit the problem or does virtio-rng need more
> > metadata than just a bytestream?
> 
> If anything virtio-rng uses *less* info than the virtio-serial driver
> (it has one stream, in one direction, only).
> 
> But the kernel already has a virtio-rng driver and it already works in a
> particular way, which is already implemented in other hypervisors. I
> didn't write the kernel driver, it is pre-existing. I suppose we could
> ask the kernel devs if they'd like another virtio-based rng driver in
> addition to the one already in use? but that seems perverse.

I agree with you on that.

> Why even bother with virtio at all if you're going to abstract
> everything away behind a serial port? we could just modify all the guest
> kernel virtio-block drivers to serialise all their metadata, then we
> could use serial ports for that too!

No, the point is virtio-rng *is* already a byte stream.  It's already
a perfect match for a serial port, unless there's something you're not
saying about it.  Does it need to tell the host how much entropy it
needs, or does it just trickle through, stealing host entropy when it
doesn't need it, and stalling the guest when the trickle isn't enough?
If it's dumb enough to just trickle the entropy through, that would
seem a very good match to virtio-serial - apart from the guests which
already expect virtio-rng support.

-- Jamie
Ian Molton - April 20, 2010, 9:31 p.m.
Jamie Lokier wrote:
> Ian Molton wrote:
>> Jamie Lokier wrote:

>>> What do the other hypervisors supporting virtio-rng do?
>> Um... support it? I'm not sure what you mean there.
> 
> Do the other hypervisors do anything special to support EGD, or is it
> just treated as another kind of serial port connected to /dev/urandom
> (or whatever) on the host?

They treat it as a paravirtualised PCI RNG device :-)

The hypervisor reads /dev/{u,}random *only* in at least one
implementation (or did at the time of writing, and I doubt it's changed).
It was, IIRC, somewhat criticised for not applying rate limiting in the
hypervisor, allowing one guest to steal all the host's available entropy
- something I sought to avoid in my implementation.

However, that's a hypervisor issue, not a protocol issue. The fact is that
the kernel has a channel for receiving RNG data from a hypervisor. That
channel is virtio-rng.

Yes one can do it other ways. In fact the whole reason I wrote the code
is because people who *are* doing it 'other ways' wanted it done this
way, like the other hypervisors.

> Wouldn't it be trivial to run egd on the guest, talking over the
> recent virtio-serial driver to egd on the host?

Yes, but that isn't the point. Many problems have more than one
solution. Using virtio-rng means that the data is going into the guest
kernel's hwrng subsystem. Whether this is the right way or not for a given
user really isn't for me or anyone else to argue - it's more than
efficient enough, and it means that the exact same tools that run on
machines with actual RNG hardware can be used without change. If nothing
else, it's good for testing those tools.

>> Why even bother with virtio at all if you're going to abstract
>> everything away behind a serial port? we could just modify all the guest
>> kernel virtio-block drivers to serialise all their metadata, then we
>> could use serial ports for that too!
> 
> No, the point is virtio-rng *is* already a byte stream.  It's already
> a perfect match for a serial port,

Sure, *any* byte stream would be. That's what serial ports /do/.

But it's just not the point. I'm not here to argue whether one solution or
another is the world's best One True Way (tm). I've merely implemented
one solution in qemu that other hypervisors *AND* the kernel already
support, and that users want. It's non-invasive. It performs well. It
passes FIPS-140. What more could anyone want? Not including a feature
because you might do it differently is simply not an argument, unless
your solution is better in some way - say it performs massively
better, or it's hugely smaller... so let's see!

Pros:
virtio-rng
------------
 * uses pre-existing kernel infrastructure
 * quick
 * tested to pass FIPS-140 (statistically no worse than the host
                            natively (FIPS-140 is a probabilistic
                            test, some failures are the nature of
                            the beast))
 * very small driver (smaller than a serial driver + serial core)
 * does not require virtio-serial core
 * does not require unusual and untested daemons on the guest

virtio-serial
-------------
 * um. Help me out here?

Cons:
virtio-rng
-------------
 * I guess if you have it *and* the virtio-serial core, it's a tiny
   bit bigger. *But* that's probably offset by not needing a special
   daemon on the guest, so not really a point...

virtio-serial
-------------
 * support daemons not written yet or abusing other daemons in ways that
   aren't commonly done.
 * untested (believe me, it's easy to provoke subtle FIPS-140 FAILs)
 * requires the use of the virtio-serial core even if no other serial
   drivers are in use (overkill).
 * Impossible to interface with the kernel's hwrng core and thus
   useless for debugging it and related tools on the guest.

> If it's dumb enough to just trickle the entropy through, that would
> seem a very good match to virtio-serial - apart from the guests which
> already expect virtio-rng support.        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Isn't that latter feature a good enough reason on its own? qemu supports
a LOT of obscure hardware already - so "it's weird" is also NOT a good
reason against it. And it's not even weird or unusual in this instance.

If qemu is only to support one way of doing any given operation then we
should get rid of a LOT of other drivers - like most of the video
drivers for example, and IDE/SCSI/SATA - well, all guests should just
use virtio-block, right?

:-D

-Ian
Jamie Lokier - April 20, 2010, 9:55 p.m.
Ian Molton wrote:
> I've merely implemented one solution in qemu that other hypervisors
> *AND* the kernel already support, and that users want.

I think that's a good reason to include virtio-rng support in some form.

I suspect Paul's objection may have been due to an impression that
virtio-rng was a new thing developed by you, or an obscure thing that
nobody uses at the moment.

But: Does it have many users at present?

> virtio-serial
> -------------
>  * Impossible to interface with the kernels hwrng core and thus
>    useless for debugging it and related tools on the guest.

That was my question earlier: Impossible?  Seems like it should be
easy.  The kernel can bind its console to virtio-serial, so it should
be able to bind its hwrng to another one.

> > If it's dumb enough to just trickle the entropy through, that would
> > seem a very good match to virtio-serial - apart from the guests which
> > already expect virtio-rng support.        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Isn't that latter feature a good enough reason on its own? qemu supports
> a LOT of obscure hardware already - so "its weird" is also NOT a good
> reason against it. And its not even weird or unusual in this instance.

I agree, if there are more than a few obscure guests depending on it already.

However if it's only a handful and they can easily be upgraded to a
newer kernel using something else, maybe it's not worth it.

Your patches don't just add another device emulation.  They're a bit
more invasive than that.

> If qemu is only to support one way of doing any given operation then we
> should get rid of a LOT of other drivers - like most of the video
> drivers for example, and IDE/SCSI/SATA - well, all guests should just
> use virtio-block, right?

Sometimes, when I notice the increasing number of bug reports that
older guest OSes no longer work, I get the impression that's what KVM
developers would prefer :-)

For a certain class of users that's right.  Some users (generally the
"virtualization farm" variety) only need to run recent guests which
support virtio.

-- Jamie
Gerd Hoffmann - April 21, 2010, 7:43 a.m.
On 04/20/10 23:31, Ian Molton wrote:

> Using virtio-rng means that the data is going into the guest
> kernels hwrng subsystem.

Which is *the* major advantage of the virtio-rng driver.  In case the 
guest kernel is recent enough to have support for it, it will
JustWork[tm].  No need for guest configuration, no need for some 
userspace tool.  I'd like to see this driver being merged.

With any kind of serial port (be it a emulated 16550 or virtio-serial) 
you'll need some daemon running inside the guest grabbing entropy data 
from the port and feeding it back into the kernel.  Nobody prevents you 
from doing that if you want (or have to due to lack of virtio-rng 
support in the guest).  Just link the port with the egd chardev and 
configure the guest accordingly.

cheers,
   Gerd
Jamie Lokier - April 21, 2010, 9:40 a.m.
Gerd Hoffmann wrote:
> On 04/20/10 23:31, Ian Molton wrote:
> 
> >Using virtio-rng means that the data is going into the guest
> >kernels hwrng subsystem.
> 
> Which is *the* major advantage of the virtio-rng driver.  In case the 
> guest kernel is recent enough to have support for it, it will
> JustWork[tm].  No need for guest configuration, no need for some 
> userspace tool.  I'd like to see this driver being merged.
> 
> With any kind of serial port (be it a emulated 16550 or virtio-serial) 
> you'll need some daemon running inside the guest grabbing entropy data 
> from the port and feeding it back into the kernel.

That's a bunch of false assumptions.

There's no reason a hwrng connector to virtio-serial could not be
automatic in a similar way to the console.

But enough of that: It's history now; the guest virtio-rng has existed
for more than a year.  It is also amazingly short and simple.  Yay for Rusty!

I don't object to virtio-rng; I think it's fine in principle and would
be happy to see the existing guests which have a virtio-rng driver
make use of it.

A bit of overlapping functionality is rife in emulators anyway :-)

-- Jamie
Ian Molton - April 21, 2010, 12:34 p.m.
Jamie Lokier wrote:

> But enough of that: It's history now; the guest virtio-rng has existed
> for more than a year.  It is also amazingly short and simple.  Yay for Rusty!

Ok, so...

If we can now take steps to integrate the code...

I'd like to get some feedback (or merging!) of the other parts of the
patch, namely:-

* the SIZE property patch:    Msg-Id: <4BB206B9.80501@collabora.co.uk>
* the socket reconnect patch: Msg-Id: <4B18055B.1030606@collabora.co.uk>

If we can get these patches sorted out, the only outstanding issues are
where the EGD protocol support and rate limiting should go.

IMO, both are so small and simple that I'd leave them in the driver;
however, I'm open to (realistic) suggestions on how else to go about
this. AIUI, qemu's chardev layer doesn't 'do' line disciplines, which
complicates matters.

For reference, the EGD protocol is implemented in 5 lines of code, and
rate limiting in about 10.
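
For the curious: the EGD wire format really is tiny - a one-byte command,
optionally followed by a one-byte length. A sketch of building the
blocking-read request, assuming the command values from the egd/prngd
protocol description (the function name is illustrative, not code from the
patch):

```c
#include <stddef.h>
#include <stdint.h>

/* EGD command bytes, as described in the egd/prngd protocol docs
 * (values assumed from that description, not taken from the patch). */
enum {
    EGD_CMD_QUERY_POOL    = 0x00,  /* ask how much entropy is pooled   */
    EGD_CMD_READ_NONBLOCK = 0x01,  /* read up to N bytes, non-blocking */
    EGD_CMD_READ_BLOCK    = 0x02,  /* read exactly N bytes, blocking   */
};

/* Build a blocking-read request for 'len' bytes into 'req' (which must
 * hold at least 2 bytes).  The length field is a single byte, so the
 * request is clamped to 255 bytes.  Returns the request size. */
static size_t egd_build_read_request(uint8_t *req, size_t len)
{
    if (len > 255) {
        len = 255;
    }
    req[0] = EGD_CMD_READ_BLOCK;
    req[1] = (uint8_t)len;
    return 2;
}
```

With 0x02 the daemon replies with exactly the requested number of entropy
bytes, which is why the qemu side needs so little parsing code.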

I don't mind changing qemu_gettimeofday for some other counter (does
qemu provide a monotonic counter (from the host's perspective) of some
sort?) - if anyone is /really/ bothered by the fact that one buffer fill
could happen at 2x the set rate should the timeofday be glitched.
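
A token-bucket sketch of such rate limiting, parameterised on a
caller-supplied monotonic timestamp instead of qemu_gettimeofday (names
and interface are illustrative, not the patch's actual code):

```c
#include <stdint.h>

/* Minimal token bucket: 'rate' bytes are credited per second of
 * monotonic time, capped at 'burst'.  Callers ask how many bytes they
 * may pass through right now. */
typedef struct {
    uint64_t rate;      /* bytes per second                */
    uint64_t burst;     /* max accumulated credit, bytes   */
    uint64_t credit;    /* currently available bytes       */
    uint64_t last_ns;   /* monotonic time of last update   */
} RateLimit;

static void ratelimit_init(RateLimit *rl, uint64_t rate, uint64_t burst,
                           uint64_t now_ns)
{
    rl->rate = rate;
    rl->burst = burst;
    rl->credit = burst;
    rl->last_ns = now_ns;
}

/* Try to consume up to 'want' bytes; returns how many are allowed. */
static uint64_t ratelimit_take(RateLimit *rl, uint64_t want, uint64_t now_ns)
{
    uint64_t elapsed = now_ns - rl->last_ns;
    /* Credit earned since the last call (may overflow for absurdly
     * large rates/gaps; fine for a sketch). */
    uint64_t earned = elapsed * rl->rate / 1000000000ULL;

    rl->credit += earned;
    if (rl->credit > rl->burst) {
        rl->credit = rl->burst;
    }
    rl->last_ns = now_ns;

    if (want > rl->credit) {
        want = rl->credit;
    }
    rl->credit -= want;
    return want;
}
```

Because the tick is monotonic, a host clock step can never grant extra
credit, which sidesteps the 2x-rate glitch described above.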

:-D

-Ian
Gerd Hoffmann - April 21, 2010, 1:55 p.m.
Hi,

> * the SIZE property patch:    Msg-Id:<4BB206B9.80501@collabora.co.uk>

Fine with me.

> * the socket reconnect patch: Msg-Id:<4B18055B.1030606@collabora.co.uk>

Not sure yet.

> If we can get these patches sorted out, the only outstanding issues are
> where the EGD protocol support and rate limiting should go.

I think it makes sense to have a separate chardev backend for it, so you 
can easily hook it up to either virtio-rng or something else, i.e. 
define a chardev for the egd connection like this:

-chardev backend=egd,id=egd,server=$address,$rate-limit-options-here

then link it to virtio-rng ...

-device virtio-rng-pci,chardev=egd

... or virtio-serial ...

-device virtserialport,chardev=egd,name=org.qemu.egd

... or an emulated 16550 ....

-device isa-serial,chardev=egd

... depending on what you prefer and what your guest can handle.

It might make sense to have the reconnect logic in the egd chardev 
backend then, thereby obsoleting the socket reconnect patch.

cheers,
   Gerd
Ian Molton - April 22, 2010, 7:06 p.m.
Gerd Hoffmann wrote:
>   Hi,
> 
>> * the SIZE property patch:    Msg-Id:<4BB206B9.80501@collabora.co.uk>
> 
> Fine with me.

\o/

So should I re-post that patch, or can I count on that being folded into
mainline?

>> * the socket reconnect patch: Msg-Id:<4B18055B.1030606@collabora.co.uk>
> 
> Not sure yet.

Comment below...

> I think it makes sense to have a separate chardev backend for it, so you
> can easily hook it up to either virtio-rng or something else, i.e.
> define a chardev for the egd connection like this:
> 
> -chardev backend=egd,id=egd,server=$address,$rate-limit-options-here

Yes, I like the look of that, at least in principle.

> It might make sense to have the reconnect logic in the egd chardev
> backend then, thereby obsoleting the socket reconnect patch.

I'm not sure I agree there... surely there are other things which would
benefit from generic socket reconnection support (virtio-rng can't be the
only driver that might want to rely on a reliable source of data via a
socket in a server-farm type situation?)

Do we really want to re-implement reconnection (and reconnection retry
anti-flood limiting) in every single backend?

Thanks for the review - if we can nail down the reconnection issue, I'll
set about a rework of the patchset and resubmit :-)

-Ian
Jamie Lokier - April 22, 2010, 9:05 p.m.
Ian Molton wrote:
> > It might make sense to have the reconnect logic in the egd chardev
> > backend then, thereby obsoleting the socket reconnect patch.
> 
> Im not sure I agree there... surely there are other things which would
> benefit from generic socket reconnection support (virtio-rng cant be the
> only driver that might want to rely on a reliable source of data via a
> socket in a server-farm type situation?)

First of all: Why are your egd daemons with open connections dying
anyway?  Buggy egd?

Secondly: why isn't egd death an event reported over QMP, with a
monitor command to reconnect manually?

If guests need a _reliable_ source of data for security, silently not
complaining when it's gone away and hoping it comes back isn't good
enough.  It should be an error condition known to management, which
can halt the guest until egd is fixed or restarts if running without
entropy isn't acceptable in its policy.

Thirdly, which other things do you think would use it?

Maybe some virtio-serial apps would like it.

But then it would need to sync with the guest on reconnection, so that
the guest can restart whatever protocol it's using over the byte
stream.

In which case, it's better to tell the guest that the connection died,
and give the guest a way to request a new one when it's ready.

Reconnecting and resuming in the middle of the byte stream would be bad
(even for egd protocol?).  Pure /dev/urandom fetching is quite unusual in not
caring about this, but you shouldn't need to reconnect to that.

> Do we really want to re-implement reconnection (and reconnection retry
> anti-flood limiting) in every single backend?

I don't think it'll happen.  I think egd is a rather unusual case.

If another backend ever needs it, it's easy to move code around.

I'm not convinced there's a need for it even for egd.  Either egd
shouldn't be killing open connections (and is buggy if it is), or this
is normal egd behavior and so it's part of the egd protocol to
repeatedly reconnect, and therefore can go in the egd client code.

Meanwhile, because the egd might not return, it should be reported as
an error condition over QMP for management to do what it deems
appropriate.  In which case, management could tell it to reconnect
when it thinks is a good time, or do other things like switch the
randomness source to something else, or stop the guest, or warn the
admin that a guest is running without entropy.

-- Jamie
Gerd Hoffmann - April 23, 2010, 8:27 a.m.
Hi,

> Im not sure I agree there... surely there are other things which would
> benefit from generic socket reconnection support (virtio-rng cant be the
> only driver that might want to rely on a reliable source of data via a
> socket in a server-farm type situation?)

Usually qemu takes the server part, i.e. for serial ports you usually do 
'-serial telnet::$port,server,nowait', then 'telnet $host $port' to 
connect to your virtual serial line.  When the connection drops, just 
re-run telnet.

In my usage of qemu I haven't come across a use case which needs qemu
reconnecting yet.

cheers,
   Gerd
Ian Molton - April 23, 2010, 9:28 a.m.
Gerd Hoffmann wrote:
>   Hi,
> 
>> Im not sure I agree there... surely there are other things which would
>> benefit from generic socket reconnection support (virtio-rng cant be the
>> only driver that might want to rely on a reliable source of data via a
>> socket in a server-farm type situation?)
> 
> Usually qemu takes the server part, i.e. for serial ports you usually do
> '-serial telnet::$port,server,nowait', then 'telnet $host $port' to
> connect to your virtual serial line.  When the connection drops, just
> re-run telnet.
>
> In my usage of qemu I didn't came across a use case which needs qemu
> reconnecting yet.

You're comparing apples with oranges :-)

That example is the opposite of what's happening in my case - qemu must
act as a client in order to connect to an EGD daemon. There is no other
choice. I'm sure one could come up with any number of cases where qemu
as a client might want to reconnect to the host.

Seriously, if virtio-rng is the only thing on qemu that acts as a socket
*client* I'd be amazed.

And really, is having the ability to reconnect to a service so terrible?

-Ian
Ian Molton - April 23, 2010, 10:17 a.m.
Jamie Lokier wrote:
> Ian Molton wrote:
>>> It might make sense to have the reconnect logic in the egd chardev
>>> backend then, thereby obsoleting the socket reconnect patch.
>> Im not sure I agree there... surely there are other things which would
>> benefit from generic socket reconnection support (virtio-rng cant be the
>> only driver that might want to rely on a reliable source of data via a
>> socket in a server-farm type situation?)
> 
> First of all: Why are your egd daemons with open connections dying
> anyway?  Buggy egd?

*AAAAAAAArgh*

Sorry but this is getting to be really annoying now.

I've answered this question at least once (more than that now, I think).

No, they aren't buggy. But occasionally, the sysadmin of said server
farms may want to, you know, update the daemon? This can and *does*
happen, and as explained before, was the actual reason WHY I implemented
reconnect.

> Secondly: why isn't egd death an event reported over QMP, with a
> monitor command to reconnect manually?

How would qemu know the socket was anything special? Not everyone uses a
monitor anyway.

> If guests need a _reliable_ source of data for security, silently not
> complaining when it's gone away and hoping it comes back isn't good
> enough.

Why? It's not like the guest:

a) Has a choice in the matter
b) Would carry on without the entropy (it knows it has no entropy)

> It should be an error condition known to management, which
> can halt the guest until egd is fixed or restarts if running without
> entropy isn't acceptable in its policy.

What? Why? The guest itself is aware of the out-of-entropy case, just
like real hardware is aware of it. You don't halt every process on the
host just because your entropy source burnt out; the kernel already
handles things, because reads on /dev/hwrng block in this circumstance.

(and anyone using /dev/urandom for crypto directly and without
monitoring real entropy levels is an utter fool).

> Thirdly, which other things do you think would use it?

Anything which reads a socket to feed data to qemu that might
conceivably have the remote program die / be upgraded / lose its
internet connection. That's hundreds of possibilities (thousands even).

> Maybe some virtio-serial apps would like it.

Quite possibly.

> But then it would need to sync with the guest on reconnection, so that
> the guest can restart whatever protocol it's using over the byte
> stream.

Er, why? We're not talking about port forwarding here, we're talking
about emulation of device hardware.

From the guest's point of view, losing the connection is akin to pulling
the wires out of the back of the machine, not ripping out the PCI card.

reconnection in this scenario should (and is in virtio-rng) be handled
at the device layer or higher. (the latter, in the case of
socket-reconnect support).

> In which case, it's better to tell the guest that the connection died,
> and give the guest a way to request a new one when it's ready.

Er. no. I think you've misthought there.

> Reconnecting and resuming in the middle of the byte stram would be bad
> (even for egd protocol?).

Really? Because that's what happens from a hardware perspective every
time I yank an ethernet cable etc.

As long as a hook from the reconnect support down to the driver layer is
provided, the drivers can decide how to handle the reconnection.

> Pure /dev/urandom fetching is quite unusual in not
> caring about this, but you shouldn't need to reconnect to that.

EGD only needs minimal support too, as it happens.

But that's not the point. A 'fake' ethernet driver might, for example,
use the reconnect event to signal the guest that the 'physical link' is
up again. The guest would be at liberty to do what it pleased with that
information - just like if someone had yanked a real ethernet lead on
real hardware.

>> Do we really want to re-implement reconnection (and reconnection retry
>> anti-flood limiting) in every single backend?
> 
> I don't think it'll happen.  I think egd is a rather unusual
>
> If another backend ever needs it, it's easy to move code around.

*bangs head on wall*

That was the exact same argument I made about the rate limiting code.
Why is that apparently only valid if it's not me saying it?

Also can I point to this email from Gerd which I got *before* I wrote
socket-reconnect support. Had I been told it'd never be merged then, I
would never have bothered writing it.

Msg-ID: <4B0E4893.9050305@redhat.com>

Perhaps now you can see why I'm getting so pissed off about all this -
it's not like I just wrote this shit and then demanded it be merged. I
sodding well asked first.

> I'm not convinced there's a need for it even for egd.

So what? I'm not convinced there's a need for about 90% of what's out
there, including the entire x86 architecture, which should have died out
immediately after people discovered how crap the 286 was compared to
nearly every other contemporary CPU. But it does exist, because some
people find it useful. Just because you don't like it or you would do it
differently does NOT mean there is a valid reason not to have it.

>  Either egd
> shouldn't be killing open connections (and is buggy if it is), or this
> is normal egd behavior and so it's part of the egd protocol to
> repeatedly reconnect, and therefore can go in the egd client code.

Nope, wrong on both counts (see above).

> Meanwhile, because the egd might not return, it should be reported as
> an error condition over QMP for management to do what it deems
> appropriate.  In which case, management could tell it to reconnect
> when it thinks is a good time, or do other things like switch the
> randomness source to something else, or stop the guest, or warn the
> admin that a guest is running without entropy.

I'm not averse to adding QMP support to the socket reconnect code. Seems
like it might be useful.

However, it's important to remember that QMP isn't in control of
everything. The EGD daemon is a system-wide thing, handled by the init
scripts and monitors, out of QMP's immediate control. It'd be best to
have both ways of doing things, IMO (and not everyone uses a monitor).

-Ian
Gerd Hoffmann - April 23, 2010, 2:07 p.m.
On 04/23/10 11:28, Ian Molton wrote:
> Gerd Hoffmann wrote:
>>    Hi,
>>
>>> Im not sure I agree there... surely there are other things which would
>>> benefit from generic socket reconnection support (virtio-rng cant be the
>>> only driver that might want to rely on a reliable source of data via a
>>> socket in a server-farm type situation?)
>>
>> Usually qemu takes the server part, i.e. for serial ports you usually do
>> '-serial telnet::$port,server,nowait', then 'telnet $host $port' to
>> connect to your virtual serial line.  When the connection drops, just
>> re-run telnet.
>>
>> In my usage of qemu I didn't came across a use case which needs qemu
>> reconnecting yet.
>
> You're comparing apples with oranges :-)

IMHO I don't.

> That example is the opposite of whats happening in my case - qemu must
> act as a client in order to connect to an EGD daemon.

Sure.  If I have the choice I usually pick qemu taking the server role. 
  For the egd case there is no choice, qemu has to be the client.  Which 
makes egd a special case, so we could handle the special needs in the 
egd bits.

> Seriously, if virtio-rng is the only thing on qemu that acts as a socket
> *client* I'd be amazed.

You can configure any chardev to be a TCP client.  I never do that
though as I find it much more convenient to configure it as server.

> And really, is having the ability to reconnect to a service so terrible?

No.

Lets step back.

We want: allow linking the egd entropy data stream to any device -
virtio-rng, isa-serial, virtio-serial, whatever.  Which means the
reconnect and rate limit logic should *not* sit in virtio-rng but in the 
chardev feeding the device.

Agree?

Ok, now for the chardev we have basically two choices:

   (1) Create a special egd chardev backend which handles reconnects and
       rate limiting automatically.
   (2) Add options for reconnect and rate limiting to the socket chardev
       backend.

For (1) you can (at least initially) simply hardcode sensible rates and 
reconnect retry times for egd needs.  I suspect it is the easier road 
for you.

With (2) the reconnect/rate limit bits are reusable for other use cases,
which is nice indeed.  But designing the config options then becomes a
bit trickier, because you can't ignore the possible other use cases ...
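
As a concrete sketch of what (2) might look like on the command line -
the `reconnect` and `rate` option names, and the address values, are
hypothetical here, not agreed syntax:

```
-chardev socket,id=egd0,host=localhost,port=7777,reconnect=5,rate=1024
-device virtio-rng-pci,chardev=egd0
```

where `reconnect` would be a retry delay in seconds and `rate` a cap in
bytes per second, both handled generically by the socket backend.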

cheers,
   Gerd
Ian Molton - April 23, 2010, 3:49 p.m.
On 23/04/10 15:07, Gerd Hoffmann wrote:

>>> In my usage of qemu I didn't came across a use case which needs qemu
>>> reconnecting yet.
>>
>> You're comparing apples with oranges :-)
>
> IMHO I don't.

Well, comparing a connection where qemu is the server to one where it is 
the client doesn't seem terribly valid to me...

>> That example is the opposite of whats happening in my case - qemu must
>> act as a client in order to connect to an EGD daemon.
>
> Sure. If I have the choice I usually pick qemu taking the server role.
> For the egd case there is no choice, qemu has to be the client. Which
> makes egd a special case, so we could handle the special needs in the
> egd bits.

I really don't think it's that unusual...

> You can configure any chardev to be a tcp client. I never do that though
> as I find it much more convenient to configure it as server.

Perhaps that's because chardev clients are nearly useless right now
because they just die if the connection drops...

> Lets step back.
>
> We want: Allow linking the egd entropy data stream to any device
> virtio-rng, isa-serial, virtio-serial, whatelse. Which means the
> reconnect and rate limit logic should *not* sit in virtio-rng but in the
> chardev feeding the device.
>
> Agree?

Yes.

> Ok, now for the chardev we have basically two choices:
>
> (1) Create a special egd chardev backend which handles reconnects and
> rate limiting automatically.
> (2) Add options for reconnect and rate limiting to the socket chardev
> backend.

Ok.

> For (1) you can (at least initially) simply hardcode sensible rates and
> reconnect retry times for egd needs. I suspect it is the easier road for
> you.

But the chardev layer won't reconnect for you.

Or are you suggesting that we create another type of chardev, that's
nearly like a socket, but speaks egd and can reconnect? That seems
hideous to me.

> With (2) the reconnect/rate limit bits are reusable for other use cases.
> Which is nice indeed. But designing the config options part will become
> a bit more tricky then, because you can't ignore possible other use
> cases then ...

But (2) is already done. OK, the options might be 'simplistic' right
now, but what's to say we can't add more (like "burst=foo" or something)
should the need arise?

Seriously, I had thought when you suggested it earlier that a chardev
backend was the right way, but now I see that a backend to handle egd
would also have to handle connecting etc., and that's just wrong. What we
need ideally is something like a line discipline that talks egd.

That way it wouldn't matter if it were a socket or anything else that 
the data came in via, which is the case with the patch as I wrote it - 
you can feed in EGD from a file, a socket, anything, and it just works.

Rolling the EGD, reconnects, connection handling etc. all into a special 
backend would lose us that functionality, so I'm against that.

Have you actually tried the code out? It works very naturally, reuses as 
much built in qemu functionality as it can, and has a nice commandline 
syntax I think. What advantage is there in breaking that?

-Ian
Jamie Lokier - April 23, 2010, 5:32 p.m.
Ian Molton wrote:
> >You can configure any chardev to be a tcp client. I never do that though
> >as I find it much more convenient to configure it as server.
> 
> Perhaps thats because chardev clients are nearly useless right now 
> because they just die if the connection drops...

Which is why drops/missing server should be a QMP event + action, same
as other triggers like disk full and watchdog trigger.

I do not want my guests to continue running if they are configured to
depend on Qemu entropy and it's not available.

> Or are you suggesting that we create another type of chardev, thats 
> nearly like a socket, but speaks egd and can reconnect? That seems 
> hideous to me.

Why hideous?

An egd chardev is a good thing because you can then trivially
use it as a random byte source for virtio-serial, isa-serial,
pci-serial, custom-soc-serial, debug-port even :-), and anything
else which might want random bytes as Gerd said.

That's way more useful than restricting to virtio-rng, because most
guests don't support virtio at all, but they can probably all take
entropy from a serial-like device.

Similarly the ability to connect to /dev/urandom directly, with the
rate-limiting but no auto-reconnection, looking like a chardev in
the same way, would make sense.  Reconnection is not needed in this
case - missing device should be an error at startup.

Your idea for an 'egd line discipline' would need to look exactly like
a chardev internally, or all the devices which might find it useful
would have to be changed to know about line disciplines, or it just
wouldn't be available as a random byte source to everything that uses
a chardev - unnecessarily limiting.

There's nothing wrong with the egd chardev actually _being
implemented_ like a line discipline on top of another chardev, with a
chardev interface so everything can use it.

In which case it's quite natural to expose the options as a
user-visible chardev 'egd', defined to return random bytes on input
and ignore output, which takes all the same options as 'socket' and
actually uses a 'socket' chardev (passing along the options).

(Is there any actual point in supporting egd over non-sockets?)

I think rate-limiting is more generically useful as a 'line
discipline'-like feature, to work with any chardev type.  But it
should then have properties governing incoming and outgoing rate
limiting separately, which won't get much testing while the only
imminent user is input-only.
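A generic per-direction limiter of the kind described here could be sketched as a simple token bucket, refilled once per second from whatever clock the emulator uses. This is purely illustrative (none of these names are real qemu structures or APIs):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-direction rate limiter; each chardev direction
 * (incoming, outgoing) would own one of these. */
typedef struct RateLimit {
    size_t bytes_per_sec;   /* 0 means unlimited */
    size_t tokens;          /* bytes still allowed in the current window */
    long   last_refill_ms;  /* timestamp of the last refill */
} RateLimit;

/* Return how many of 'want' bytes may pass at time 'now_ms'. */
static size_t ratelimit_allow(RateLimit *rl, size_t want, long now_ms)
{
    size_t n;

    if (rl->bytes_per_sec == 0) {
        return want;                    /* unlimited */
    }
    if (now_ms - rl->last_refill_ms >= 1000) {
        rl->tokens = rl->bytes_per_sec; /* refill once per second */
        rl->last_refill_ms = now_ms;
    }
    n = want < rl->tokens ? want : rl->tokens;
    rl->tokens -= n;
    return n;
}
```

A chardev wrapper would call `ratelimit_allow()` before passing bytes through in either direction, which is what makes the in/out limits independently configurable.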

> That way it wouldn't matter if it were a socket or anything else that 
> the data came in via, which is the case with the patch as I wrote it - 
> you can feed in EGD from a file, a socket, anything, and it just works.

What's the point in feeding egd protocol from a file?
If you want entropy from a file, it should probably be raw, not egd protocol.

-- Jamie
Jamie Lokier - April 24, 2010, 1:37 a.m.
Ian Molton wrote:
> Jamie Lokier wrote:
> > First of all: Why are your egd daemons with open connections dying
> > anyway?  Buggy egd?
> 
> No, they aren't buggy, but occasionally, the sysadmin of said server
> farms may want to, you know, update the daemon?

Many daemons don't kill active connections on upgrade.  For example
sshd, telnetd, ftpd, rsyncd...  Only new connections get the new daemon.

But let's approach this from a different angle:

What do _other_ long-lived EGD clients do?  Is it:

   1. When egd is upgraded, the clients break.
   3. Active connections aren't killed on egd upgrade.
   2. They keep trying to reconnect, as you've implemented in qemu.
   4. Whenever they want entropy, they're expected to open a
      connection, request what they want, read it, and close.  Each time.

Whatever other long-lived clients do, that's probably best for qemu
too.

4 is interesting because it's an alternative approach to rate-limiting
the byte stream: Instead, fetch a batch of bytes in a single
open/read/close transaction when needed.  Rate-limit _that_, and you
don't need separate reconnection code.
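A transactional fetch in the style of option 4 would send the same two-byte request the patch already uses (command `EGD_READ_BLOCK` = 0x2, then a one-byte count, capped at 255 by the protocol's single length byte). A minimal sketch of just the request-building step, with the socket open/read/close plumbing omitted:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EGD_READ_BLOCK 0x02   /* same command byte the patch uses */
#define EGD_MAX_BLOCK  255    /* EGD's length field is a single byte */

/* Fill req[2] with one EGD blocking-read request; return the number of
 * bytes actually requested. A transactional client would connect, send
 * this, read exactly that many bytes, close - and repeat no faster
 * than the rate cap allows, instead of rate-limiting a stream. */
static size_t egd_build_request(uint8_t req[2], size_t want)
{
    size_t n = want > EGD_MAX_BLOCK ? EGD_MAX_BLOCK : want;

    req[0] = EGD_READ_BLOCK;
    req[1] = (uint8_t)n;
    return n;
}
```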

So I tried checking whether egd kills connections when upgraded, and found...

No 'egd' package for my Debian and Ubuntu systems, nor anything which
looks obvious.  There are several other approaches to gathering
entropy from hardware sources, for example rng-tools, haveged, ekeyd, and
ekeyd-egd-linux (aha... it's a client).

All of those have in common: they fetch entropy from something, and
write it to the kernel's /dev/random pool.  Applications are expected
to read from that pool.

In particular, if you do have a hardware or network EGD entropy
source, you can run ekeyd-egd-linux which is an EGD client, which
transfers from EGD -> the kernel, so that applications can read from
/dev/random.

That means, on Debian & Ubuntu Linux at least, there is no need for
applications to talk EGD protocol themselves, even to get network or
hardware entropy - it's better left to egd-linux, rng-tools etc. to
manage.

But the situation is no doubt different on non-Linux hosts.

By the way, ekeyd-egd-linux is a bit thoughtful: For example it has a
"shannons-per-byte" option, and it doesn't drain the EGD server at all
when the local pool is sufficiently full.

Does your EGD client + virtio-rng support do that - avoid draining the
source when the guest's pool is full enough?

> > If guests need a _reliable_ source of data for security, silently not
> > complaining when it's gone away and hoping it comes back isn't good
> > enough.
> 
> Why? It's not like the guest:
> 
> a) Has a choice in the matter
> b) Would carry on without the entropy (it knows it has no entropy)

Because one might prefer a big red light, a halted machine removed
from the cluster which can resume its work when ready, and an email to
warn you that the machine isn't able to operate normally, _without_
having to configure each guest's email - rather than a working machine
with an increasing number of stuck crypto processes waiting on
/dev/random, which runs out of memory, gets into swap hell, and has to
be rebooted, losing the other work it was in the middle of doing.

Well, you personally might not prefer that.  But that's why we
separate policy from mechanism...

> > But then it would need to sync with the guest on reconnection, so that
> > the guest can restart whatever protocol it's using over the byte
> > stream.
> 
> Er, why? we're not talking about port forwarding here, we're talking
> about emulation of device hardware.

virtio-serial isn't emulating a normal serial port.  It supports apps
like "send machine status blobs regularly", without having to be
robust against half a blob being delivered.

You can design packets so that doesn't matter, but virtio-serial
supports not needing to do that, making the apps simpler.

> > I don't think it'll happen.  I think egd is a rather unusual
> > If another backend ever needs it, it's easy to move code around.
> 
> *bangs head on wall*
> 
> That was the exact same argument I made about the rate limiting code.
> Why is that apparently only valid if its not me that says it?

Because you're talking to multiple people who hold different opinions,
and opinions change as more is learned and thought about.  It's
iterative, and I, for one, am not in a position to make merging
decisions, only give my view on it.  Can't speak for the others.

> > I'm not convinced there's a need for it even for egd.
> 
> So what? I'm not convinced there's a need for about 90% of what's out
> there,

Ah, that's not quite what I meant.  I meant I wasn't convinced it is
needed for egd, not that I don't think anyone should use egd.  (But now
I see that egd-linux has a "reconnect time" option; perhaps reconnecting
_is_ a de facto part of the EGD protocol.)

But now that we've confirmed that on Debian & Ubuntu, all hardware
entropy sources are injected into /dev/random by userspace daemons
rather than serving EGD protocol, and if you do have an EGD server you
can run egd-linux and apps can read /dev/random, and egd-linux won't
drain the EGD server unnecessarily... are you sure EGD support is
appropriate?  Is it different on, say, Fedora?  Or are you thinking of
other hosts?

-- Jamie
Ian Molton - April 24, 2010, 8:58 a.m.
On 24/04/10 02:37, Jamie Lokier wrote:
> Ian Molton wrote:
>> Jamie Lokier wrote:
>>> First of all: Why are your egd daemons with open connections dying
>>> anyway?  Buggy egd?
>>
>> No, they aren't buggy. but occasionally, the sysadmin of said server
>> farms may want to, you know, update the daemon?
>
> Many daemons don't kill active connections on upgrade.  For example
> sshd, telnetd, ftpd, rsyncd...  Only new connections get the new daemon.

That's actually completely irrelevant, but ok.

> But let's approach this from a different angle:

Oh god, not again...

> What do _other_ long-lived EGD clients do?  Is it:
>
>     1. When egd is upgraded, the clients break.

I'm sure some do this.

>     3. Active connections aren't killed on egd upgrade.

I don't know of any egd servers that are that nice.

>     2. They keep trying to reconnect, as you've implemented in qemu.

Some do.

>     4. Whenever they want entropy, they're expected to open a
>        connection, request what they want, read it, and close.  Each time.

Some do that too.

> Whatever other long-lived clients do, that's probably best for qemu
> too.

Well, clearly that's going to be inconclusive.

> 4 is interesting because it's an alternative approach to rate-limiting
> the byte stream: Instead, fetch a batch of bytes in a single
> open/read/close transaction when needed.  Rate-limit _that_, and you
> don't need separate reconnection code.

You're effectively talking about idle-disconnect. It's not actually a 
bad idea, but we still need reconnect support in some form, in case the 
server goes away mid-fetch.

> So I tried checking whether egd kills connections when upgraded, and found...
>
> No 'egd' package for my Debian and Ubuntu systems, nor anything which
> looks obvious.  There are several other approaches to gathering
> entropy from hardware sources, for example rng-tools, haveged, ekeyd, and
> ekeyd-egd-linux (aha... it's a client).

ekeyd is an EGD server.

> All of those have in common: they fetch entropy from something, and
> write it to the kernel's /dev/random pool.  Applications are expected
> to read from that pool.

Um, no. Applications *can* read that pool. It's not the only way, and 
designing apps that can ONLY do that is forcing policy.

> In particular, if you do have a hardware or network EGD entropy
> source, you can run ekeyd-egd-linux which is an EGD client, which
> transfers from EGD -> the kernel, so that applications can read from
> /dev/random.

You *can* - but what if you have two entropy sources and you don't want 
the guests sucking down entropy from the host's source, only their shared 
source?

> That means, on Debian & Ubuntu Linux at least, there is no need for
> applications to talk EGD protocol themselves, even to get network or
> hardware entropy - it's better left to egd-linux, rng-tools etc. to
> manage.

And if you don't use Debian or Ubuntu, or you install your own package 
(you can use non-packaged software, and all...)

> But the situation is no doubt different on non-Linux hosts.

Indeed. /dev/random doesn't even exist on many hosts.

> By the way, ekeyd-egd-linux is a bit thoughtful: For example it has a
> "shannons-per-byte" option, and it doesn't drain the EGD server at all
> when the local pool is sufficiently full.

Indeed it is. As it happens, I'm working with the folks who wrote it.

> Does your EGD client + virtio-rng support do that - avoid draining the
> source when the guest's pool is full enough?

Actually yes, although that involves trusting the guest, which is why 
my implementation has its own rate limiting - to prevent guest abuse 
of the host's pool.

>>> If guests need a _reliable_ source of data for security, silently not
>>> complaining when it's gone away and hoping it comes back isn't good
>>> enough.
>>
>> Why? It's not like the guest:
>>
>> a) Has a choice in the matter
>> b) Would carry on without the entropy (it knows it has no entropy)
>
> Because one might prefer a big red light, a halted machine removed
> from the cluster which can resume its work when ready, and an email to
> warn you that the machine isn't able to operate normally, _without_
> having to configure each guest's email - rather than a working machine
> with an increasing number of stuck crypto processes waiting on
> /dev/random, which runs out of memory, gets into swap hell, and has to
> be rebooted, losing the other work it was in the middle of doing.
>
> Well, you personally might not prefer that.  But that's why we
> separate policy from mechanism...

That's something of a doomsday scenario, but hey - if you like having QMP 
support, feel free to add it to the patch - that's what open source is 
all about, right?

> virtio-serial isn't emulating a normal serial port.  It supports apps
> like "send machine status blobs regularly", without having to be
> robust against half a blob being delivered.
>
> You can design packets so that doesn't matter, but virtio-serial
> supports not needing to do that, making the apps simpler.

Have you actually read the code? virtio-rng is FAR simpler than 
virtio-serial.

>>> I don't think it'll happen.  I think egd is a rather unusual
>>> If another backend ever needs it, it's easy to move code around.
>>
>> *bangs head on wall*
>>
>> That was the exact same argument I made about the rate limiting code.
>> Why is that apparently only valid if its not me that says it?
>
> Because you're talking to multiple people who hold different opinions,
> and opinions change as more is learned and thought about.

Round in circles, apparently. This is getting on for the fourth time 
round...

> Ah, that's not quite what I meant.  I meant I wasn't convinced it is
> needed for egd, not that I don't think anyone should use egd.  (But now
> I see that egd-linux has a "reconnect time" option; perhaps reconnecting
> _is_ a de facto part of the EGD protocol.)

Actually, EGD is very much an ad-hoc standard, I'm afraid. But it does 
exist, and my implementation both works and is compliant with the 
standard, such as it is.

> But now that we've confirmed that on Debian & Ubuntu, all hardware
> entropy sources are injected into /dev/random by userspace daemons
> rather than serving EGD protocol,

No, we know that they *can be*, not that they are - and not even by 
default; you actually have to elect to install a package to do that.

> and if you do have an EGD server you
> can run egd-linux and apps can read /dev/random, and egd-linux won't
> drain the EGD server unnecessarily...

If you want the entropy to enter the *host's* pool, then sure - but I 
thought we weren't about forcing policy on the users?

> are you sure EGD support is appropriate?

Yes. It's what the users want. It's not broken or inefficient. Therefore 
it's appropriate.

> Is it different on, say, Fedora?  Or are you thinking of
> other hosts?

There do exist hosts with no /dev/random as it happens.

-Ian
Ian Molton - April 24, 2010, 9:16 a.m.
On 23/04/10 18:32, Jamie Lokier wrote:
> Ian Molton wrote:
>>> You can configure any chardev to be a tcp client. I never do that though
>>> as I find it much more convenient to configure it as server.
>>
>> Perhaps that's because chardev clients are nearly useless right now
>> because they just die if the connection drops...
>
> Which is why drops/missing server should be a QMP event + action, same
> as other triggers like disk full and watchdog trigger.

So write the code then. I neither want nor care about that feature, so 
why should I write it?

> I do not want my guests to continue running if they are configured to
> depend on Qemu entropy and it's not available.

Realistically though, the causes of EGD disappearing are:

1) the daemon dies
2) the daemon is replaced

In the case of 1), potentially the host and *every single* guest 
will run out of entropy. You only need one flashing red light, and the 
host can tell you that as well as any of the guests.

Entropy starvation doesn't usually lead to the catastrophic failure you 
describe - 99.9999% of linux machines have no entropy source configured 
and they get along just fine - even with process address space 
randomisation enabled.

In the case of 2), the downtime is so short that a warning would be 
inappropriate, and the sysadmin would be *right there* anyway.

>> Or are you suggesting that we create another type of chardev, that's
>> nearly like a socket, but speaks egd and can reconnect? That seems
>> hideous to me.
>
> Why hideous?

Because automatic reconnection should not apply to only one type of 
socket, but to all. Why deliberately cripple a feature and make it specific 
when one does not have to?

> An egd chardev is a good thing because you can then trivially
> use it as a random byte source for virtio-serial, isa-serial,
> pci-serial, custom-soc-serial, debug-port even :-), and anything
> else which might want random bytes as Gerd said.

Yes, but qemu (AFAICT) does not support "line disciplines" so we can 
either have

chrdev_egd+reconnect  ->  virtio-* backend

or

chrdev_socket+reconnect -> virtio-rng+egd backend


IOW, we can either have generic reconnect support, or generic egd 
support, but not both.

Ideally, we'd have line disciplines and it'd be more like this:

chrdev_socket+reconnect -> egd ldisc -> virtio-* backend

So, given that we don't have line disciplines (and I'm not about to write 
more code that is just going to get squabbled over for months), that 
leaves us with a choice to make.

I personally say that generic EGD support is less useful than generic 
socket reconnect support, since at a pinch, and as others have pointed 
out, EGD can be implemented externally to qemu.

Socket reconnection _cannot_ be implemented externally, and frankly, I 
consider it a bug that qemu presents an option (client sockets) that, 
once the connection dies, the user has no way of reinstating other 
than to kill and restart qemu.

> That's way more useful than restricting to virtio-rng, because most
> guests don't support virtio at all, but they can probably all take
> entropy from a serial-like device.

a) The kind of guest that can't do virtio is damn unlikely to be in use 
in a server-farm, and

b) Nothing stops someone running an egd->serial converter on the host in 
order to feed a serial-like device.

> Similarly the ability to connect to /dev/urandom directly, with the
> rate-limiting but no auto-reconnection, looking like a chardev in
> the same way, would make sense.  Reconnection is not needed in this
> case - missing device should be an error at startup.

My current code will happily read entropy from a file, a char device or 
a socket. It can *also* speak EGD on bidirectional connections like sockets.

> Your idea for an 'egd line discipline' would need to look exactly like
> a chardev internally

Well of course it would. That's what line disciplines are supposed to do. 
(Look at their analogue in the Linux tty code.)

> In which case it's quite natural to expose the options as a
> user-visible chardev 'egd', defined to return random bytes on input
> and ignore output, which takes all the same options as 'socket' and
> actually uses a 'socket' chardev (passing along the options).

Unless I'm missing something, qemu cannot chain multiple chrdevs 
together, which means that the hypothetical egd chardev would have to 
duplicate 100% of the socket chrdev logic.
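If chaining were possible, the duplication would disappear: the egd chardev could hold a pointer to an inner chardev and forward everything to it, overriding only the parts that speak EGD. A minimal sketch of that delegation pattern (none of these types are real qemu structures; the counting chardev exists only for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal chardev interface */
typedef struct Chardev Chardev;
struct Chardev {
    int (*write)(Chardev *c, const unsigned char *buf, int len);
};

/* An 'egd' wrapper: same interface, delegates to a wrapped transport */
typedef struct EgdChardev {
    Chardev base;    /* exposes the same interface as any chardev */
    Chardev *inner;  /* the wrapped transport, e.g. a socket chardev */
} EgdChardev;

/* Forward writes to the wrapped transport unchanged; a real egd
 * wrapper would also intercept incoming data to strip EGD framing. */
static int egd_write(Chardev *c, const unsigned char *buf, int len)
{
    EgdChardev *e = (EgdChardev *)c;
    return e->inner->write(e->inner, buf, len);
}

static void egd_chardev_init(EgdChardev *e, Chardev *inner)
{
    e->base.write = egd_write;
    e->inner = inner;
}

/* Illustration-only inner chardev: counts the bytes "sent" */
static int bytes_sent;
static int counting_write(Chardev *c, const unsigned char *buf, int len)
{
    (void)c; (void)buf;
    bytes_sent += len;
    return len;
}
```

With such chaining, the 'egd' chardev would pass its socket options straight through to the inner 'socket' chardev instead of reimplementing it.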

> I think rate-limiting is more generically useful as a 'line
> discipline'-like feature, to work with any chardev type.  But it
> should then have properties governing incoming and outgoing rate
> limiting separately, which won't get much testing for the only
> imminent user which is input-only.

but again, qemu doesn't *have* line disciplines.

-Ian
Anthony Liguori - May 3, 2010, 5:56 p.m.
On 03/30/2010 09:13 AM, Ian Molton wrote:
>
> > From 5f484301d73fa53009bbcd430f8ae85868b67772 Mon Sep 17 00:00:00 2001
> From: Ian Molton<ian.molton@collabora.co.uk>
> Date: Tue, 17 Nov 2009 14:34:12 +0000
> Subject: [PATCH 2/4] virtio: Add virtio-rng driver
>
> 	This patch adds support for virtio-rng. Data is read from a chardev and
> can be either raw entropy or received via the EGD protocol.
>
> Signed-off-by: Ian Molton<ian.molton@collabora.co.uk>
>    

Hi Ian,

A couple of review comments below.  I've read the thread.  virtio-rng 
should stand on its own as it's a spec'd-out virtio device, so I'm with 
you there.

The SIZE qdev patch can't be applied until there's a consumer of it.  
I've already expressed my opinion about the socket reconnect patch.  I 
don't think it's the right functionality for qemu to implement because 
the semantics concern me.  Really, a more thorough refactoring of 
CharDrivers is needed such that it supports connected and disconnected 
states, along with QMP events for the transitions.
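A refactoring along those lines might expose an explicit connection state plus a transition hook where the QMP event would be emitted. A sketch only - these names are illustrative, not the actual CharDriver API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical explicit connection state for a chardev backend */
typedef enum { CHR_DISCONNECTED, CHR_CONNECTED } ChrState;

typedef struct ChrConn {
    ChrState state;
    /* in qemu this hook would raise a QMP event; here, a callback */
    void (*on_change)(ChrState new_state, void *opaque);
    void *opaque;
} ChrConn;

static void chr_set_state(ChrConn *c, ChrState s)
{
    if (c->state != s) {            /* fire only on real transitions */
        c->state = s;
        if (c->on_change) {
            c->on_change(s, c->opaque);
        }
    }
}

/* Illustration-only listener that counts transitions */
static int transitions;
static void count_transition(ChrState s, void *opaque)
{
    (void)s; (void)opaque;
    transitions++;
}
```

Reconnect policy (retry, halt the guest, notify management) could then live outside the chardev, driven by these events.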

I agree with Paul that the EGD logic ought to be separated.  It probably 
makes sense to modify this a bit to have a split front-end/back-end.  
For instance:

-device virtio-rng-pci,rngdev=foo -rngdev egd,addr=unix:/path/to/rng,id=foo

You could also have a syntax helper like:

-rng egd,addr:/path/to/rng

See the recent virtfs patch series for an example of this.

With those changes, I'd happily apply this series.

> ---
>   Makefile.target |    2 +-
>   hw/pci.h        |    1 +
>   hw/virtio-pci.c |   29 ++++++++
>   hw/virtio-rng.c |  195
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>   hw/virtio-rng.h |   19 ++++++
>   hw/virtio.h     |    2 +
>   rng.h           |   14 ++++
>   7 files changed, 261 insertions(+), 1 deletions(-)
>   create mode 100644 hw/virtio-rng.c
>   create mode 100644 hw/virtio-rng.h
>   create mode 100644 rng.h
>
> diff --git a/Makefile.target b/Makefile.target
> index 16ea443..2f9c929 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -166,7 +166,7 @@ obj-y += qemu-timer.o
>   # virtio has to be here due to weird dependency between PCI and virtio-net.
>   # need to fix this properly
>   obj-y += virtio-blk.o virtio-balloon.o virtio-net.o virtio-serial-bus.o
> -obj-y += rwhandler.o
> +obj-y += virtio-rng.o rwhandler.o
>   obj-$(CONFIG_KVM) += kvm.o kvm-all.o
>   LIBS+=-lz
>
> diff --git a/hw/pci.h b/hw/pci.h
> index 20c670e..dc61cb5 100644
> --- a/hw/pci.h
> +++ b/hw/pci.h
> @@ -69,6 +69,7 @@
>   #define PCI_DEVICE_ID_VIRTIO_BLOCK       0x1001
>   #define PCI_DEVICE_ID_VIRTIO_BALLOON     0x1002
>   #define PCI_DEVICE_ID_VIRTIO_CONSOLE     0x1003
> +#define PCI_DEVICE_ID_VIRTIO_RNG         0x1004
>
>   #define FMT_PCIBUS                      PRIx64
>
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index 6eb19cd..bee3105 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -23,6 +23,7 @@
>   #include "msix.h"
>   #include "net.h"
>   #include "block_int.h"
> +#include "rng.h"
>   #include "loader.h"
>
>   /* from Linux's linux/virtio_pci.h */
> @@ -95,6 +96,7 @@ typedef struct {
>       uint32_t nvectors;
>       BlockConf block;
>       NICConf nic;
> +    RNGConf rng;
>       uint32_t host_features;
>       /* Max. number of ports we can have for a the virtio-serial device */
>       uint32_t max_virtserial_ports;
> @@ -552,6 +554,21 @@ static int virtio_balloon_init_pci(PCIDevice *pci_dev)
>       return 0;
>   }
>
> +static int virtio_rng_init_pci(PCIDevice *pci_dev)
> +{
> +    VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
> +    VirtIODevice *vdev;
> +
> +    vdev = virtio_rng_init(&pci_dev->qdev,&proxy->rng);
> +    virtio_init_pci(proxy, vdev,
> +                    PCI_VENDOR_ID_REDHAT_QUMRANET,
> +                    PCI_DEVICE_ID_VIRTIO_RNG,
> +                    PCI_CLASS_OTHERS,
> +                    0x00);
> +
> +    return 0;
> +}
> +
>   static PCIDeviceInfo virtio_info[] = {
>       {
>           .qdev.name = "virtio-blk-pci",
> @@ -606,6 +623,18 @@ static PCIDeviceInfo virtio_info[] = {
>           },
>           .qdev.reset = virtio_pci_reset,
>       },{
> +        .qdev.name = "virtio-rng-pci",
> +        .qdev.size = sizeof(VirtIOPCIProxy),
> +        .init      = virtio_rng_init_pci,
> +        .exit      = virtio_exit_pci,
> +        .qdev.props = (Property[]) {
> +            DEFINE_PROP_CHR("chardev",     VirtIOPCIProxy, rng.chrdev),
> +            DEFINE_PROP_SIZE("rate",       VirtIOPCIProxy, rng.rate, 0),
> +            DEFINE_PROP_STRING("protocol", VirtIOPCIProxy, rng.proto),
> +            DEFINE_PROP_END_OF_LIST(),
> +        },
> +        .qdev.reset = virtio_pci_reset,
> +    },{
>           /* end of list */
>       }
>   };
> diff --git a/hw/virtio-rng.c b/hw/virtio-rng.c
> new file mode 100644
> index 0000000..154ea76
> --- /dev/null
> +++ b/hw/virtio-rng.c
> @@ -0,0 +1,195 @@
> +/*
> + * Virtio RNG Device
> + *
> + * Copyright Collabora 2009
> + *
> + * Authors:
> + *  Ian Molton<ian.molton@collabora.co.uk>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "hw.h"
> +#include "qemu-char.h"
> +#include "virtio.h"
> +#include "virtio-rng.h"
> +#include "rng.h"
> +#include<sys/time.h>
> +
> +typedef struct VirtIORng
> +{
> +    VirtIODevice vdev;
> +    VirtQueue *vq;
> +    CharDriverState *chr;
> +    struct timeval last;
> +    int rate;
> +    int egd;
> +    int entropy_remaining;
> +    int pool;
> +} VirtIORng;
> +
> +/* Maximum size of the buffer the guest expects */
> +#define BUFF_MAX 64
> +
> +/* EGD protocol - we only use this command */
> +#define EGD_READ_BLOCK 0x2
> +
> +#define EGD_MAX_BLOCK_SIZE 255
> +#define EGD_MAX_REQUESTS 3
> +#define EGD_MAX_POOL_SIZE (EGD_MAX_BLOCK_SIZE * (EGD_MAX_REQUESTS-1))
> +
> +static inline void req_entropy(VirtIORng *s)
> +{
> +    static const unsigned char entropy_rq[2] = { EGD_READ_BLOCK,
> +                                                 EGD_MAX_BLOCK_SIZE };
> +    if (s->egd) {
> +        /* Let the socket buffer up the incoming data for us. Max block
> size
> +           for EGD protocol is (stupidly) 255, so make sure we always
> have a
> +           block pending for performance. We can have 3 outstanding
> buffers */
>    

Whitespace damaged (broken mailer?).

> +        if (s->pool<= EGD_MAX_POOL_SIZE) {
> +            s->chr->chr_write(s->chr, entropy_rq, sizeof(entropy_rq));
> +            s->pool += EGD_MAX_BLOCK_SIZE;
> +        }
> +    } else {
> +        s->pool = BUFF_MAX;
> +    }
> +}
> +
> +static int vrng_can_read(void *opaque)
> +{
> +    VirtIORng *s = (VirtIORng *) opaque;
> +    struct timeval now, d;
> +    int max_entropy;
> +
> +    if (!virtio_queue_ready(s->vq) ||
> +        !(s->vdev.status&  VIRTIO_CONFIG_S_DRIVER_OK) ||
> +        virtio_queue_empty(s->vq))
> +        return 0;
>    

Missing braces (CODING_STYLE).

> +    req_entropy(s);
> +
> +    if (s->rate) {
> +        qemu_gettimeofday(&now);
> +        timersub(&now,&s->last,&d);
> +        if (d.tv_sec * 1000000 + d.tv_usec>  1000000) {
> +            s->entropy_remaining = s->rate;
> +            s->last = now;
> +        }
>    

Should be based on rt_clock

> +        max_entropy = MIN(s->pool, s->entropy_remaining);
> +    } else {
> +        max_entropy = s->pool;
> +    }
> +
> +    /* current implementations have a 64 byte buffer.
> +     * We fall back to a one byte per read if there is not enough room.
> +     */
> +    max_entropy = MIN(max_entropy, BUFF_MAX);
> +    if (max_entropy) {
> +        if (virtqueue_avail_bytes(s->vq, max_entropy, 0))
> +            return max_entropy;
> +        if (virtqueue_avail_bytes(s->vq, 1, 0))
> +            return 1;
>    

Missing braces (CODING_STYLE).

> +    }
> +    return 0;
> +}
> +
> +static void vrng_read(void *opaque, const uint8_t *buf, int size)
> +{
> +    VirtIORng *s = (VirtIORng *) opaque;
> +    VirtQueueElement elem;
> +    int offset = 0;
> +
> +    /* The current kernel implementation has only one outstanding input
> +     * buffer of 64 bytes.
> +     */
> +    while (offset<  size) {
> +        int i = 0;
> +        if (!virtqueue_pop(s->vq,&elem))
> +                break;
>    

Braces.

> +        while (offset<  size&&  i<  elem.in_num) {
> +            int len = MIN(elem.in_sg[i].iov_len, size - offset);
> +            memcpy(elem.in_sg[i].iov_base, buf + offset, len);
> +            offset += len;
> +            i++;
> +        }
> +        virtqueue_push(s->vq,&elem, size);
> +    }
> +
> +    if (s->rate)
> +        s->entropy_remaining -= size;
>    

Braces (lots more too).

> +    s->pool -= size;
> +
> +    virtio_notify(&s->vdev, s->vq);
> +}
> +
> +static void vrng_event(void *opaque, int event)
> +{
> +
> +}
> +
> +
> +
> +static void virtio_rng_handle(VirtIODevice *vdev, VirtQueue *vq)
> +{
> +    /* Nothing to do - we push data when its available */
> +}
> +
> +static uint32_t virtio_rng_get_features(VirtIODevice *vdev, uint32_t
> features)
> +{
> +    return 0;
> +}
> +
> +static void virtio_rng_save(QEMUFile *f, void *opaque)
> +{
> +    VirtIORng *s = opaque;
> +
> +    virtio_save(&s->vdev, f);
> +}
> +
> +static int virtio_rng_load(QEMUFile *f, void *opaque, int version_id)
> +{
> +    VirtIORng *s = opaque;
> +
> +    if (version_id != 1)
> +        return -EINVAL;
> +
> +    virtio_load(&s->vdev, f);
> +    return 0;
> +}
> +
> +VirtIODevice *virtio_rng_init(DeviceState *dev, RNGConf *rngdev)
> +{
> +    VirtIORng *s;
> +    s = (VirtIORng *)virtio_common_init("virtio-rng",
> +                                            VIRTIO_ID_RNG,
> +                                            0, sizeof(VirtIORng));
> +
> +    if (!s)
> +        return NULL;
> +
> +    s->vdev.get_features = virtio_rng_get_features;
> +
> +    s->vq = virtio_add_queue(&s->vdev, 128, virtio_rng_handle);
> +    s->chr = rngdev->chrdev;
> +    s->rate = rngdev->rate;
> +    gettimeofday(&s->last, NULL);
> +
> +    if(rngdev->proto&&  !strncmp(rngdev->proto, "egd", 3))
> +        s->egd = 1;
> +
> +#ifdef DEBUG
>    

It's more typical to introduce a debugging macro (like dprintf()) and 
then have a single define at the top of the file to enable the macro.

This will make conversion to a more unified logging mechanism a lot 
easier down the road.
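The pattern described here is roughly the following (an illustrative sketch, not copied from qemu; the macro and switch names are made up):

```c
#include <stdio.h>

/* One switch at the top of the file controls all debug output */
#define VIRTIO_RNG_DEBUG 0

/* Compiles away entirely when the switch is 0; the do/while wrapper
 * keeps the macro safe inside unbraced if/else bodies. */
#define dprintf(fmt, ...)                                            \
    do {                                                             \
        if (VIRTIO_RNG_DEBUG) {                                      \
            fprintf(stderr, "virtio-rng: " fmt, ## __VA_ARGS__);     \
        }                                                            \
    } while (0)
```

Unlike an `#ifdef DEBUG` block, the call sites stay compiled (and thus syntax-checked) even when debugging is off.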

Regards,

Anthony Liguori

> +    printf("entropy being read from %s", rngdev->chrdev->label);
> +    if(s->rate)
> +        printf(" at %d bytes/sec max.", s->rate);
> +    printf(" protocol: %s\n", s->egd?"egd":"raw");
> +#endif
> +
> +    qemu_chr_add_handlers(s->chr, vrng_can_read, vrng_read, vrng_event, s);
> +
> +    register_savevm("virtio-rng", -1, 1, virtio_rng_save,
> virtio_rng_load, s);
> +
> +    return&s->vdev;
> +}
> +
> diff --git a/hw/virtio-rng.h b/hw/virtio-rng.h
> new file mode 100644
> index 0000000..bc4d2d8
> --- /dev/null
> +++ b/hw/virtio-rng.h
> @@ -0,0 +1,19 @@
> +/*
> + * Virtio RNG Support
> + *
> + * Copyright Collabora 2009
> + *
> + * Authors:
> + *  Ian Molton<ian.molton@collabora.co.uk>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + */
> +#ifndef _QEMU_VIRTIO_RNG_H
> +#define _QEMU_VIRTIO_RNG_H
> +
> +/* The ID for virtio console */
> +#define VIRTIO_ID_RNG 4
> +
> +#endif
> diff --git a/hw/virtio.h b/hw/virtio.h
> index 3baa2a3..749b2a6 100644
> --- a/hw/virtio.h
> +++ b/hw/virtio.h
> @@ -16,6 +16,7 @@
>
>   #include "hw.h"
>   #include "net.h"
> +#include "rng.h"
>   #include "qdev.h"
>   #include "sysemu.h"
>   #include "block_int.h"
> @@ -174,6 +175,7 @@ VirtIODevice *virtio_blk_init(DeviceState *dev,
> BlockConf *conf);
>   VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf);
>   VirtIODevice *virtio_serial_init(DeviceState *dev, uint32_t max_nr_ports);
>   VirtIODevice *virtio_balloon_init(DeviceState *dev);
> +VirtIODevice *virtio_rng_init(DeviceState *dev, RNGConf *rngdev);
>
>   void virtio_net_exit(VirtIODevice *vdev);
>
> diff --git a/rng.h b/rng.h
> new file mode 100644
> index 0000000..14ae7e0
> --- /dev/null
> +++ b/rng.h
> @@ -0,0 +1,14 @@
> +#ifndef QEMU_RNG_H
> +#define QEMU_RNG_H
> +
> +#include "qemu-option.h"
> +
> +/* qdev rng properties */
> +
> +typedef struct RNGConf {
> +    CharDriverState *chrdev;
> +    uint64_t rate;
> +    char *proto;
> +} RNGConf;
> +
> +#endif
>

Patch

diff --git a/Makefile.target b/Makefile.target
index 16ea443..2f9c929 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -166,7 +166,7 @@  obj-y += qemu-timer.o
 # virtio has to be here due to weird dependency between PCI and virtio-net.
 # need to fix this properly
 obj-y += virtio-blk.o virtio-balloon.o virtio-net.o virtio-serial-bus.o
-obj-y += rwhandler.o
+obj-y += virtio-rng.o rwhandler.o
 obj-$(CONFIG_KVM) += kvm.o kvm-all.o
 LIBS+=-lz

diff --git a/hw/pci.h b/hw/pci.h
index 20c670e..dc61cb5 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -69,6 +69,7 @@ 
 #define PCI_DEVICE_ID_VIRTIO_BLOCK       0x1001
 #define PCI_DEVICE_ID_VIRTIO_BALLOON     0x1002
 #define PCI_DEVICE_ID_VIRTIO_CONSOLE     0x1003
+#define PCI_DEVICE_ID_VIRTIO_RNG         0x1004

 #define FMT_PCIBUS                      PRIx64

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 6eb19cd..bee3105 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -23,6 +23,7 @@ 
 #include "msix.h"
 #include "net.h"
 #include "block_int.h"
+#include "rng.h"
 #include "loader.h"

 /* from Linux's linux/virtio_pci.h */
@@ -95,6 +96,7 @@  typedef struct {
     uint32_t nvectors;
     BlockConf block;
     NICConf nic;
+    RNGConf rng;
     uint32_t host_features;
     /* Max. number of ports we can have for a the virtio-serial device */
     uint32_t max_virtserial_ports;
@@ -552,6 +554,21 @@  static int virtio_balloon_init_pci(PCIDevice *pci_dev)
     return 0;
 }

+static int virtio_rng_init_pci(PCIDevice *pci_dev)
+{
+    VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
+    VirtIODevice *vdev;
+
+    vdev = virtio_rng_init(&pci_dev->qdev, &proxy->rng);
+    virtio_init_pci(proxy, vdev,
+                    PCI_VENDOR_ID_REDHAT_QUMRANET,
+                    PCI_DEVICE_ID_VIRTIO_RNG,
+                    PCI_CLASS_OTHERS,
+                    0x00);
+
+    return 0;
+}
+
 static PCIDeviceInfo virtio_info[] = {
     {
         .qdev.name = "virtio-blk-pci",
@@ -606,6 +623,18 @@  static PCIDeviceInfo virtio_info[] = {
         },
         .qdev.reset = virtio_pci_reset,
     },{
+        .qdev.name = "virtio-rng-pci",
+        .qdev.size = sizeof(VirtIOPCIProxy),
+        .init      = virtio_rng_init_pci,
+        .exit      = virtio_exit_pci,
+        .qdev.props = (Property[]) {
+            DEFINE_PROP_CHR("chardev",     VirtIOPCIProxy, rng.chrdev),
+            DEFINE_PROP_SIZE("rate",       VirtIOPCIProxy, rng.rate, 0),
+            DEFINE_PROP_STRING("protocol", VirtIOPCIProxy, rng.proto),
+            DEFINE_PROP_END_OF_LIST(),
+        },
+        .qdev.reset = virtio_pci_reset,
+    },{
         /* end of list */
     }
 };
diff --git a/hw/virtio-rng.c b/hw/virtio-rng.c
new file mode 100644
index 0000000..154ea76
--- /dev/null
+++ b/hw/virtio-rng.c
@@ -0,0 +1,195 @@ 
+/*
+ * Virtio RNG Device
+ *
+ * Copyright Collabora 2009
+ *
+ * Authors:
+ *  Ian Molton <ian.molton@collabora.co.uk>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw.h"
+#include "qemu-char.h"
+#include "virtio.h"
+#include "virtio-rng.h"
+#include "rng.h"
+#include <sys/time.h>
+
+typedef struct VirtIORng
+{
+    VirtIODevice vdev;
+    VirtQueue *vq;
+    CharDriverState *chr;
+    struct timeval last;
+    int rate;
+    int egd;
+    int entropy_remaining;
+    int pool;
+} VirtIORng;
+
+/* Maximum size of the buffer the guest expects */
+#define BUFF_MAX 64
+
+/* EGD protocol - we only use this command */
+#define EGD_READ_BLOCK 0x2
+
+#define EGD_MAX_BLOCK_SIZE 255
+#define EGD_MAX_REQUESTS 3
+#define EGD_MAX_POOL_SIZE (EGD_MAX_BLOCK_SIZE * (EGD_MAX_REQUESTS-1))
+
+static inline void req_entropy(VirtIORng *s)
+{
+    static const unsigned char entropy_rq[2] = { EGD_READ_BLOCK,
+                                                 EGD_MAX_BLOCK_SIZE };
+    if (s->egd) {
+        /* Let the socket buffer up the incoming data for us. Max block size
+           for EGD protocol is (stupidly) 255, so make sure we always have a
+           block pending for performance. We can have 3 outstanding buffers */
+        if (s->pool <= EGD_MAX_POOL_SIZE) {
+            s->chr->chr_write(s->chr, entropy_rq, sizeof(entropy_rq));
+            s->pool += EGD_MAX_BLOCK_SIZE;
+        }
+    } else {
+        s->pool = BUFF_MAX;
+    }
+}
+
+static int vrng_can_read(void *opaque)
+{
+    VirtIORng *s = (VirtIORng *) opaque;
+    struct timeval now, d;
+    int max_entropy;
+
+    if (!virtio_queue_ready(s->vq) ||
+        !(s->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK) ||
+        virtio_queue_empty(s->vq))
+        return 0;
+
+    req_entropy(s);
+
+    if (s->rate) {
+        qemu_gettimeofday(&now);
+        timersub(&now, &s->last, &d);
+        if (d.tv_sec * 1000000 + d.tv_usec > 1000000) {
+            s->entropy_remaining = s->rate;
+            s->last = now;
+        }
+        max_entropy = MIN(s->pool, s->entropy_remaining);
+    } else {
+        max_entropy = s->pool;
+    }
+
+    /* Current implementations have a 64 byte buffer.
+     * We fall back to one byte per read if there is not enough room.
+     */
+    max_entropy = MIN(max_entropy, BUFF_MAX);
+    if (max_entropy) {
+        if (virtqueue_avail_bytes(s->vq, max_entropy, 0))
+            return max_entropy;
+        if (virtqueue_avail_bytes(s->vq, 1, 0))
+            return 1;
+    }
+    return 0;
+}
+
+static void vrng_read(void *opaque, const uint8_t *buf, int size)
+{
+    VirtIORng *s = (VirtIORng *) opaque;
+    VirtQueueElement elem;
+    int offset = 0;
+
+    /* The current kernel implementation has only one outstanding input
+     * buffer of 64 bytes.
+     */
+    while (offset < size) {
+        int i = 0;
+        if (!virtqueue_pop(s->vq, &elem))
+            break;
+        while (offset < size && i < elem.in_num) {
+            int len = MIN(elem.in_sg[i].iov_len, size - offset);
+            memcpy(elem.in_sg[i].iov_base, buf + offset, len);
+            offset += len;
+            i++;
+        }
+        virtqueue_push(s->vq, &elem, size);
+    }
+
+    if (s->rate)
+        s->entropy_remaining -= size;
+    s->pool -= size;
+
+    virtio_notify(&s->vdev, s->vq);
+}
+
+static void vrng_event(void *opaque, int event)
+{
+
+}
+
+
+
+static void virtio_rng_handle(VirtIODevice *vdev, VirtQueue *vq)
+{
+    /* Nothing to do - we push data when it's available */
+}
+
+static uint32_t virtio_rng_get_features(VirtIODevice *vdev, uint32_t features)
+{
+    return 0;
+}
+
+static void virtio_rng_save(QEMUFile *f, void *opaque)
+{
+    VirtIORng *s = opaque;
+
+    virtio_save(&s->vdev, f);
+}
+
+static int virtio_rng_load(QEMUFile *f, void *opaque, int version_id)
+{
+    VirtIORng *s = opaque;
+
+    if (version_id != 1)
+        return -EINVAL;
+
+    virtio_load(&s->vdev, f);
+    return 0;
+}
+
+VirtIODevice *virtio_rng_init(DeviceState *dev, RNGConf *rngdev)
+{
+    VirtIORng *s;
+    s = (VirtIORng *)virtio_common_init("virtio-rng",
+                                            VIRTIO_ID_RNG,
+                                            0, sizeof(VirtIORng));
+
+    if (!s)
+        return NULL;
+
+    s->vdev.get_features = virtio_rng_get_features;
+
+    s->vq = virtio_add_queue(&s->vdev, 128, virtio_rng_handle);
+    s->chr = rngdev->chrdev;
+    s->rate = rngdev->rate;
+    gettimeofday(&s->last, NULL);
+
+    if (rngdev->proto && !strncmp(rngdev->proto, "egd", 3))
+        s->egd = 1;
+
+#ifdef DEBUG
+    printf("entropy being read from %s", rngdev->chrdev->label);
+    if (s->rate)
+        printf(" at %d bytes/sec max.", s->rate);
+    printf(" protocol: %s\n", s->egd ? "egd" : "raw");
+#endif
+
+    qemu_chr_add_handlers(s->chr, vrng_can_read, vrng_read, vrng_event, s);
+
+    register_savevm("virtio-rng", -1, 1, virtio_rng_save, virtio_rng_load, s);
+
+    return &s->vdev;
+}
+
diff --git a/hw/virtio-rng.h b/hw/virtio-rng.h
new file mode 100644
index 0000000..bc4d2d8
--- /dev/null
+++ b/hw/virtio-rng.h
@@ -0,0 +1,19 @@ 
+/*
+ * Virtio RNG Support
+ *
+ * Copyright Collabora 2009
+ *
+ * Authors:
+ *  Ian Molton <ian.molton@collabora.co.uk>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+#ifndef _QEMU_VIRTIO_RNG_H
+#define _QEMU_VIRTIO_RNG_H
+
+/* The ID for virtio-rng */
+#define VIRTIO_ID_RNG 4
+
+#endif
diff --git a/hw/virtio.h b/hw/virtio.h
index 3baa2a3..749b2a6 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -16,6 +16,7 @@ 

 #include "hw.h"
 #include "net.h"
+#include "rng.h"
 #include "qdev.h"
 #include "sysemu.h"
 #include "block_int.h"
@@ -174,6 +175,7 @@  VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf *conf);
 VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf);
 VirtIODevice *virtio_serial_init(DeviceState *dev, uint32_t max_nr_ports);
 VirtIODevice *virtio_balloon_init(DeviceState *dev);
+VirtIODevice *virtio_rng_init(DeviceState *dev, RNGConf *rngdev);

 void virtio_net_exit(VirtIODevice *vdev);

diff --git a/rng.h b/rng.h
new file mode 100644
index 0000000..14ae7e0
--- /dev/null
+++ b/rng.h
@@ -0,0 +1,14 @@ 
+#ifndef QEMU_RNG_H
+#define QEMU_RNG_H
+
+#include "qemu-option.h"
+
+/* qdev rng properties */
+
+typedef struct RNGConf {
+    CharDriverState *chrdev;
+    uint64_t rate;
+    char *proto;
+} RNGConf;
+
+#endif