Message ID | 1353342180-20392-1-git-send-email-pbonzini@redhat.com |
---|---|
State | New |
Headers | show |
Paolo Bonzini <pbonzini@redhat.com> wrote: > buffered_rate_limit is called to prevent the RAM migration callback > from putting too much data in the buffer. So it has to check against > the amount of data currently in the buffer, not against the amount > of data that has been transferred so far. > > s->bytes_xfer is used to communicate between successive calls of > buffered_put_buffer. Buffered_rate_tick resets it every now and > then to prevent moving too much buffered data to the socket at > once. However, its value does not matter for the producer of the > data. > > Here is the result for migrating an idle guest with 3GB of memory > and ~360MB of non-zero memory: > > migrate_set_speed Before After > ------------------------------------------------------------ > default (32MB/sec) ~3 sec ~13 sec > infinite (10GB/sec) ~3 sec ~3 sec > > Note that before this patch, QEMU is transferring of 100 MB/sec > despite the rate limiting. This patch is wrong O;-) That don't mean that current code is right. We have 4 variables: - xfer_limit: how much we are allowed to send each 100ms (in bytes) - buffer_capacity: the size of the buffer that we are using - buffer_size: the amount of the previous buffer that we are using buffer_size < buffer_capacity, or we are doing something wrong. - bytes_xfer: How many bytes we have transfered since last 100ms timer. And how this work: - we have an input handler that copies stuff from RAM to buffer it stops when we have sent more that xfer_limit on this period (each 100ms) - we have another handler that is run each 100ms, and this one sent anything that is on the buffer, and reset bytes_xfer to zero. WHat you have done is just telling that in the 1st input handler, that we always have size on the buffer for it, so that we are not doing any rate limiting at all. Note that the migration_thread code changes all this code, and there, the statistics are handled much better. It is very strange that buffer_capacity is bigger that xfer_limit (it could happen, but it is very unusual), and then what you have done there is just disabling rate limiting altogether. So .... NACK > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > buffered_file.c | 2 +- > 1 file modificato, 1 inserzione(+). 1 rimozione(-) > > diff --git a/buffered_file.c b/buffered_file.c > index bd0f61d..46cd591 100644 > --- a/buffered_file.c > +++ b/buffered_file.c > @@ -193,7 +193,7 @@ static int buffered_rate_limit(void *opaque) > if (s->freeze_output) > return 1; > > - if (s->bytes_xfer > s->xfer_limit) o> + if (s->buffer_size > s->xfer_limit) > return 1; > > return 0;
Il 20/11/2012 12:03, Juan Quintela ha scritto: > This patch is wrong O;-) > That don't mean that current code is right. > > We have 4 variables: > - xfer_limit: how much we are allowed to send each 100ms (in bytes) > - buffer_capacity: the size of the buffer that we are using > - buffer_size: the amount of the previous buffer that we are using > buffer_size < buffer_capacity, or we are doing something > wrong. > - bytes_xfer: How many bytes we have transfered since last 100ms timer. Note that the buffered_file places a wall between producer and consumer (or rather tries to place it; my patch is an attempt to fix). The consumer side is buffered_flush & bytes_xfer, and the rate-limiting there is mandatory. However, on the producer side, the rate-limiting is only advisory, to avoid making the buffer_capacity too large. In principle (and leaving MAX_WAIT aside for a moment) you could skip qemu_file_rate_limit calls completely. It would place the whole RAM in the buffer, and still transfer it in xfer_limit chunks on the wire. > And how this work: > - we have an input handler that copies stuff from RAM to buffer > it stops when we have sent more that xfer_limit on this period (each > 100ms) > - we have another handler that is run each 100ms, and this one sent > anything that is on the buffer, and reset bytes_xfer to zero. > > WHat you have done is just telling that in the 1st input handler, that > we always have size on the buffer for it, so that we are not doing any > rate limiting at all. > > It is very strange that buffer_capacity is bigger that xfer_limit (it > could happen, but it is very unusual), and then what you have done there > is just disabling rate limiting altogether. I haven't: once s->bytes_xfer >= s->xfer_limit, buffered_flush will not do anything, and data will accumulate in the buffer until bytes_xfer is moved back to 0. So what happens really is that ram_save_block is already preparing the next chunk of data to send, but only s->xfer_limit bytes are sent on each 100ms period. However, my patch was incomplete. The desired behavior is that buffered_put_buffer(s, NULL, 0, 0) will restart the iteration, so it has to check buffer_size too. In fact, the check in buffered_put_b There is also another bug in the current code, which is an off-by-one. Comparison is s->bytes_xfer > s->xfer_limit, but it should be >= instead. And more somewhat broken checks. I'm testing a more complete fix. Paolo > > So .... > > NACK > >> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> >> --- >> buffered_file.c | 2 +- >> 1 file modificato, 1 inserzione(+). 1 rimozione(-) >> >> diff --git a/buffered_file.c b/buffered_file.c >> index bd0f61d..46cd591 100644 >> --- a/buffered_file.c >> +++ b/buffered_file.c >> @@ -193,7 +193,7 @@ static int buffered_rate_limit(void *opaque) >> if (s->freeze_output) >> return 1; >> >> - if (s->bytes_xfer > s->xfer_limit) > o> + if (s->buffer_size > s->xfer_limit) >> return 1; >> >> return 0;
Paolo Bonzini <pbonzini@redhat.com> wrote: > Il 20/11/2012 12:03, Juan Quintela ha scritto: >> This patch is wrong O;-) >> That don't mean that current code is right. >> >> We have 4 variables: >> - xfer_limit: how much we are allowed to send each 100ms (in bytes) >> - buffer_capacity: the size of the buffer that we are using >> - buffer_size: the amount of the previous buffer that we are using >> buffer_size < buffer_capacity, or we are doing something >> wrong. >> - bytes_xfer: How many bytes we have transfered since last 100ms timer. > > Note that the buffered_file places a wall between producer and consumer > (or rather tries to place it; my patch is an attempt to fix). The > consumer side is buffered_flush & bytes_xfer, and the rate-limiting > there is mandatory. > > However, on the producer side, the rate-limiting is only advisory, to > avoid making the buffer_capacity too large. In principle (and leaving > MAX_WAIT aside for a moment) you could skip qemu_file_rate_limit calls > completely. It would place the whole RAM in the buffer, and still > transfer it in xfer_limit chunks on the wire. this is the whole definition of rate-limiting O:-) Remember how this works, with callbacks. This was "another" of the reasons to move to a thread, to have finer control about that. If you look at latest patches you can see that I pass the "amount" of space in the buffer to the producer, and plan to just remove this calls (make completely no sense once that you moved to a thread). >> And how this work: >> - we have an input handler that copies stuff from RAM to buffer >> it stops when we have sent more that xfer_limit on this period (each >> 100ms) >> - we have another handler that is run each 100ms, and this one sent >> anything that is on the buffer, and reset bytes_xfer to zero. >> >> WHat you have done is just telling that in the 1st input handler, that >> we always have size on the buffer for it, so that we are not doing any >> rate limiting at all. >> >> It is very strange that buffer_capacity is bigger that xfer_limit (it >> could happen, but it is very unusual), and then what you have done there >> is just disabling rate limiting altogether. > > I haven't: once s->bytes_xfer >= s->xfer_limit, buffered_flush will not > do anything, and data will accumulate in the buffer until bytes_xfer is > moved back to 0. So what happens really is that ram_save_block is > already preparing the next chunk of data to send, but only s->xfer_limit > bytes are sent on each 100ms period. > > However, my patch was incomplete. The desired behavior is that > buffered_put_buffer(s, NULL, 0, 0) will restart the iteration, so it has > to check buffer_size too. In fact, the check in buffered_put_b > > There is also another bug in the current code, which is an off-by-one. > Comparison is s->bytes_xfer > s->xfer_limit, but it should be >= instead. > > And more somewhat broken checks. I'm testing a more complete fix. Later, Juan.
diff --git a/buffered_file.c b/buffered_file.c index bd0f61d..46cd591 100644 --- a/buffered_file.c +++ b/buffered_file.c @@ -193,7 +193,7 @@ static int buffered_rate_limit(void *opaque) if (s->freeze_output) return 1; - if (s->bytes_xfer > s->xfer_limit) + if (s->buffer_size > s->xfer_limit) return 1; return 0;
buffered_rate_limit is called to prevent the RAM migration callback from putting too much data in the buffer. So it has to check against the amount of data currently in the buffer, not against the amount of data that has been transferred so far. s->bytes_xfer is used to communicate between successive calls of buffered_put_buffer. Buffered_rate_tick resets it every now and then to prevent moving too much buffered data to the socket at once. However, its value does not matter for the producer of the data. Here is the result for migrating an idle guest with 3GB of memory and ~360MB of non-zero memory: migrate_set_speed Before After ------------------------------------------------------------ default (32MB/sec) ~3 sec ~13 sec infinite (10GB/sec) ~3 sec ~3 sec Note that before this patch, QEMU is transferring of 100 MB/sec despite the rate limiting. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- buffered_file.c | 2 +- 1 file modificato, 1 inserzione(+). 1 rimozione(-)