diff mbox

[ovs-dev,patch_v4,4/8] Userspace Datapath: Add ALG infra and FTP.

Message ID 20170713221514.GL29918@ovn.org
State Not Applicable
Headers show

Commit Message

Ben Pfaff July 13, 2017, 10:15 p.m. UTC
On Wed, Jul 05, 2017 at 09:32:22PM -0700, Darrell Ball wrote:
> ALG infra and FTP (both V4 and V6) support is added to the userspace
> datapath.  Also, NAT support is included.
> 
> Signed-off-by: Darrell Ball <dlu998@gmail.com>

Thanks a lot for working on this.  It will be a valuable feature that
brings the userspace datapath closer to the kernel datapath features.

I have some comments.  I'm not too familiar with this area, so a lot of
my comments are ones that should help to make the code easier to
understand not just for me but for other newbies to ALG implementation.

The data structures introduced or modified in this patch are almost
comment-free.  The reader is left to guess important information like
the purpose of the structure, as well as per-member info like what lists
or hmaps a structure belongs to, what a "master" is, and so on.  Please
add some comments.

The same seems warranted for the collection of macros that this
defines.  For example, what does FTP_MAX_PORT mean?  (If it's just the
maximum TCP port, what makes it FTP specific? etc.)

In the name "alg_exp_node", I don't know what an exp is.

I don't know what "delinate" means.

What OS X problem does this allude to?
+    /* CT_IPPORT_FTP is used in lieu of IPPORT_FTP as a workaround
+     * to handle OSX. */
+#define CT_IPPORT_FTP 21

Usually we put the conversion on the constant side in comparisons like
the following, because compilers are better at optimizing it:
+    return (ip_proto == IPPROTO_TCP &&
+            (ntohs(th->tcp_src) == CT_IPPORT_FTP ||
+             ntohs(th->tcp_dst) == CT_IPPORT_FTP));

I guess that not everyone knows what "tuple space" is.  I had to think
about it for a while.  I think you basically mean the ports available
for NAT.  Maybe a more user friendly term could be used (at least in
user messages)?  Why is exhaustion "likely a user error"?  (I would have
guessed that it is more likely from some kind of DoS or port scan or
equivalent.)  How should a user respond?

process_one() has a variable alg_nat_repl_addr that it zeros and then
appears to never use again.

Also in process_one(), I think that this memcpy:
            memcpy(&alg_exp_entry, alg_exp, sizeof alg_exp_entry);
can be written as just:
            alg_exp_entry = *alg_exp;
although I don't know whether you have some expectation for padding etc.

process_one() uses the expression "conn && is_ftp_ctl(pkt)" twice, and
the latter function is nontrivial.  Can it be evaluated just once?

In conntrack_execute(), it seems odd to move the loop index declaration
out of the for statement.

conn_to_ct_dpif_entry() does an xstrdup but I wasn't able to spot where
the string gets freed.

I can't see how get_v4_byte_be() works properly on both big-endian and
little-endian systems.

get_v4_byte_be() shouldn't need the "& 0xff", since it returns a
uint8_t.

In replace_substring(), it seems odd to use 8-bit quantities for sizes.
Also, it looks like 'delta' can be negative, in which case it should not
be plain char (which can be signed or unsigned given an ASCII character
set): if you really want 8-bit, use "signed char" or int8_t.

The expression "remain_substring + delta" in replace_substring() kind of
threw me for a loop.  If you expand this based on variable definitions,
you get:

        remain_substring + delta
     == (substr + substr_size) + (rep_str_size - substr_size)
     == substr + rep_str_size

In other words, the substr_size cancels out entirely, so that I think
the first four statements
+    char delta = rep_str_size - substr_size;
+    size_t move_size = total_size - substr_size;
+    char *remain_substring = substr + substr_size;
+    memmove(remain_substring + delta, remain_substring,
+            move_size);
might as well be written:
     memmove(substr + rep_str_size, substr + substr_size,
             total_size - substr_size);
which looks a lot simpler to me.

In repl_ftp_v4_addr(), don't replace_size and rc have the same value?
And there are two calls to strlen(rep_str).

I think that repl_ftp_v4_addr() would be easier to read if variables
were declared at first use, more like this:

Also in repl_ftp_v4_addr(), it seems odd that we're dealing with
maintaining a minimum frame length at this low level; I'd expect that to
happen at a higher level.  At least that's what I assume the MAX(...,
64) is for; maybe ETH_TOTAL_MIN would be better?

Instead of FTP_EPRT_CMD_SIZE (etc.), it might be better to just write
strlen(FTP_EPRT_CMD).  The meaning is clear at a glance and modern
compilers will optimize such an expression to a constant.

I found the callers of detect_ftp_ctl_mode() puzzling at first because
this function does two different things but each of its callers only
care about one of them.  How about dividing it into two functions, like
this?


This:
    /* Find first space. */
    while ((*ftp != ' ') && (*ftp != 0)) {
        ftp++;
    }
    if (*ftp != ' ') {
        return CT_FTP_CTL_INVALID;
    }
can be written as just:
    /* Find first space. */
    ftp = strchr(ftp, ' ');
    if (!ftp) {
        return CT_FTP_CTL_INVALID;
    }

I see 3 casts to integer types in process_ftp_ctl_v4() but none of them
appears to be necessary.  The (OVS_FORCE ovs_be16) cast seems especially
odd since htons() actually returns an ovs_be16.  Same in
process_ftp_ctl_v6().  Similarly for htonl(), twice, in handle_ftp_ctl().

I wonder whether the parsing in process_ftp_ctl_v4() would be easier
using sscanf().

Again, thanks a lot for implementing this feature!

Ben

Comments

Darrell Ball July 14, 2017, 5:37 p.m. UTC | #1
Thanks a lot for the review Ben

On 7/13/17, 3:15 PM, "ovs-dev-bounces@openvswitch.org on behalf of Ben Pfaff" <ovs-dev-bounces@openvswitch.org on behalf of blp@ovn.org> wrote:

    On Wed, Jul 05, 2017 at 09:32:22PM -0700, Darrell Ball wrote:
    > ALG infra and FTP (both V4 and V6) support is added to the userspace

    > datapath.  Also, NAT support is included.

    > 

    > Signed-off-by: Darrell Ball <dlu998@gmail.com>

    
    Thanks a lot for working on this.  It will be a valuable feature that
    brings the userspace datapath closer to the kernel datapath features.
    
    I have some comments.  I'm not too familiar with this area, so a lot of
    my comments are ones that should help to make the code easier to
    understand not just for me but for other newbies to ALG implementation.
    
    The data structures introduced or modified in this patch are almost
    comment-free.  The reader is left to guess important information like
    the purpose of the structure, as well as per-member info like what lists
    or hmaps a structure belongs to, what a "master" is, and so on.  Please
    add some comments.

Done
    
    The same seems warranted for the collection of macros that this
    defines.  For example, what does FTP_MAX_PORT mean?  (If it's just the
    maximum TCP port, what makes it FTP specific? etc.)

Done
    
    In the name "alg_exp_node", I don't know what an exp is.

Done
    
    I don't know what "delinate" means.

Name changed
    
    What OS X problem does this allude to?
    +    /* CT_IPPORT_FTP is used in lieu of IPPORT_FTP as a workaround
    +     * to handle OSX. */
    +#define CT_IPPORT_FTP 21

I added comment that it was build related.
    
    Usually we put the conversion on the constant side in comparisons like
    the following, because compilers are better at optimizing it:
    +    return (ip_proto == IPPROTO_TCP &&
    +            (ntohs(th->tcp_src) == CT_IPPORT_FTP ||
    +             ntohs(th->tcp_dst) == CT_IPPORT_FTP));

Got it
    
    I guess that not everyone knows what "tuple space" is.  I had to think
    about it for a while.  I think you basically mean the ports available
    for NAT.  Maybe a more user friendly term could be used (at least in
    user messages)?  Why is exhaustion "likely a user error"?  (I would have
    guessed that it is more likely from some kind of DoS or port scan or
    equivalent.)  How should a user respond?

Sure, DoS is valid too as the system may be unprotected. I expanded the
comments and added recommendations.

    
    process_one() has a variable alg_nat_repl_addr that it zeros and then
    appears to never use again.

I deprecated that variable use and I forgot to cleanup.
    
    Also in process_one(), I think that this memcpy:
                memcpy(&alg_exp_entry, alg_exp, sizeof alg_exp_entry);
    can be written as just:
                alg_exp_entry = *alg_exp;
    although I don't know whether you have some expectation for padding etc.

structure copy is fine here; I thought I had caught all of these.


    process_one() uses the expression "conn && is_ftp_ctl(pkt)" twice, and
    the latter function is nontrivial.  Can it be evaluated just once?

Yes, thanks for pointing that out.
    
    In conntrack_execute(), it seems odd to move the loop index declaration
    out of the for statement.

This is day one conntrack code, but I can fix it here.
    
    conn_to_ct_dpif_entry() does an xstrdup but I wasn't able to spot where
    the string gets freed.

This is an existing API that I just used; the caller (in dpctl) frees the string.
I added a comment that the caller frees it.
    
    I can't see how get_v4_byte_be() works properly on both big-endian and
    little-endian systems.
    
    get_v4_byte_be() shouldn't need the "& 0xff", since it returns a
    uint8_t.

right, mask is not needed
    
    In replace_substring(), it seems odd to use 8-bit quantities for sizes.
    Also, it looks like 'delta' can be negative, in which case it should not
    be plain char (which can be signed or unsigned given an ASCII character
    set): if you really want 8-bit, use "signed char" or int8_t.

Thanks for catching this; this is a real potential bug depending on the compiler 
and settings. At one point, I checked and converted all signed values to either
int or int64_t; this looks like the only one left over.
    
    The expression "remain_substring + delta" in replace_substring() kind of
    threw me for a loop.  If you expand this based on variable definitions,
    you get:
    
            remain_substring + delta
         == (substr + substr_size) + (rep_str_size - substr_size)
         == substr + rep_str_size
    
    In other words, the substr_size cancels out entirely, so that I think
    the first four statements
    +    char delta = rep_str_size - substr_size;
    +    size_t move_size = total_size - substr_size;
    +    char *remain_substring = substr + substr_size;
    +    memmove(remain_substring + delta, remain_substring,
    +            move_size);
    might as well be written:
         memmove(substr + rep_str_size, substr + substr_size,
                 total_size - substr_size);
    which looks a lot simpler to me.

I thought I was making the function easier to understand with the added variables 
of delta, move_size and remain_substring. I realize now that I was wrong.

    
    In repl_ftp_v4_addr(), don't replace_size and rc have the same value?
    And there are two calls to strlen(rep_str).

I simplified the function; thanks.
    
    I think that repl_ftp_v4_addr() would be easier to read if variables
    were declared at first use, more like this:

Done in a few functions missed earlier.
    
    Also in repl_ftp_v4_addr(), it seems odd that we're dealing with
    maintaining a minimum frame length at this low level; I'd expect that to
    happen at a higher level.  At least that's what I assume the MAX(...,
    64) is for; maybe ETH_TOTAL_MIN would be better?

This was really evil, I agree; this really needed a comment and I added one.
This is the right place to do this as the packet is potentially being
enlarged right here. The ‘64’ is power of 2 floor of these ftp packet sizes, 
not min Ethernet. It is just for safety; ‘64’ should never be the larger value.

    
    Instead of FTP_EPRT_CMD_SIZE (etc.), it might be better to just write
    strlen(FTP_EPRT_CMD).  The meaning is clear at a glance and modern
    compilers will optimize such an expression to a constant.

sure, I could not confirm strlen would be optimized in all cases, but I’ll take
your word for it. I considered sizeof(str literal) – 1, which I know works. Even
if strlen is not optimized, this case is not critical, so I used it.
    
    I found the callers of detect_ftp_ctl_mode() puzzling at first because
    this function does two different things but each of its callers only
    care about one of them.  How about dividing it into two functions, like
    this?

Sure, I knew this was ‘unclean’, but just never got to it.
Fixed now.

    
    diff --git a/lib/conntrack.c b/lib/conntrack.c
    index 8d40e9eb809d..5c58e660972d 100644
    --- a/lib/conntrack.c
    +++ b/lib/conntrack.c
    @@ -123,7 +123,6 @@ repl_ftp_v4_addr(struct dp_packet *pkt, ovs_be32 v4_addr_rep,
     
     static enum ftp_ctl_pkt
     process_ftp_ctl_v4(struct conntrack *ct,
    -                   const struct conn_lookup_ctx *ctx,
                        struct dp_packet *pkt,
                        const struct conn *conn_for_expectation,
                        long long now, ovs_be32 *v4_addr_rep,
    @@ -132,7 +131,7 @@ process_ftp_ctl_v4(struct conntrack *ct,
     
     static enum ftp_ctl_pkt
     detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
    -                    struct dp_packet *pkt, char *ftp_msg);
    +                    struct dp_packet *pkt);
     
     static void
     handle_ftp_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
    @@ -2477,9 +2476,8 @@ delinate_number(char *str, uint8_t max_digits)
         return str;
     }
     
    -static enum ftp_ctl_pkt
    -detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
    -                    struct dp_packet *pkt, char *ftp_msg)
    +static void
    +get_ftp_ctl_msg(struct dp_packet *pkt, char *ftp_msg)
     {
         struct tcp_header *th = dp_packet_l4(pkt);
         char *tcp_hdr = (char *) th;
    @@ -2491,6 +2489,13 @@ detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
     
         ovs_strlcpy(ftp_msg, tcp_hdr + tcp_hdr_len,
                     tcp_payload_of_interest);
    +}
    +
    +static enum ftp_ctl_pkt
    +detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx, struct dp_packet *pkt)
    +{
    +    char ftp_msg[LARGEST_FTP_MSG_OF_INTEREST + 1] = {0};
    +    get_ftp_ctl_msg(pkt, ftp_msg);
     
         if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
             if (strncasecmp(ftp_msg, FTP_EPRT_CMD, FTP_EPRT_CMD_SIZE) &&
    @@ -2510,7 +2515,6 @@ detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
     
     static enum ftp_ctl_pkt
     process_ftp_ctl_v4(struct conntrack *ct,
    -                   const struct conn_lookup_ctx *ctx,
                        struct dp_packet *pkt,
                        const struct conn *conn_for_expectation,
                        long long now, ovs_be32 *v4_addr_rep,
    @@ -2525,7 +2529,7 @@ process_ftp_ctl_v4(struct conntrack *ct,
         char *ftp = ftp_msg;
         enum ct_alg_mode mode;
     
    -     detect_ftp_ctl_mode(ctx, pkt, ftp_msg);
    +    get_ftp_ctl_msg(pkt, ftp_msg);
         *ftp_data_v4_start = tcp_hdr + tcp_hdr_len;
     
         if (!strncasecmp(ftp_msg, FTP_PORT_CMD, FTP_PORT_CMD_SIZE)) {
    @@ -2652,7 +2656,6 @@ skip_ipv6_digits(char *str)
     
     static enum ftp_ctl_pkt
     process_ftp_ctl_v6(struct conntrack *ct,
    -                   const struct conn_lookup_ctx *ctx,
                        struct dp_packet *pkt,
                        const struct conn *conn_for_expectation,
                        long long now,
    @@ -2668,7 +2671,7 @@ process_ftp_ctl_v6(struct conntrack *ct,
         char *ftp = ftp_msg;
         struct in6_addr ip6_addr;
     
    -     detect_ftp_ctl_mode(ctx, pkt, ftp_msg);
    +    get_ftp_ctl_msg(pkt, ftp_msg);
         *ftp_data_start = tcp_hdr + tcp_hdr_len;
     
         if (!strncasecmp(ftp_msg, FTP_EPRT_CMD, FTP_EPRT_CMD_SIZE)) {
    @@ -2803,13 +2806,12 @@ handle_ftp_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
         size_t addr_offset_from_ftp_data_start;
         int64_t seq_skew = 0;
         bool seq_skew_dir;
    -    char ftp_msg[LARGEST_FTP_MSG_OF_INTEREST + 1] = {0};
         size_t addr_size = 0;
         char *ftp_data_start;
         bool do_seq_skew_adj = true;
         enum ct_alg_mode mode = CT_FTP_MODE_ACTIVE;
     
    -    if (detect_ftp_ctl_mode(ctx, pkt, ftp_msg) != ftp_ctl) {
    +    if (detect_ftp_ctl_mode(ctx, pkt) != ftp_ctl) {
             return;
         }
     
    @@ -2823,12 +2825,12 @@ handle_ftp_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
         } else if (ftp_ctl == CT_FTP_CTL_INTEREST) {
             enum ftp_ctl_pkt rc;
             if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
    -            rc = process_ftp_ctl_v6(ct, ctx, pkt, conn_for_expectation,
    +            rc = process_ftp_ctl_v6(ct, pkt, conn_for_expectation,
                                         now, &v6_addr_rep, &ftp_data_start,
                                         &addr_offset_from_ftp_data_start,
                                         &addr_size, &mode);
             } else {
    -            rc = process_ftp_ctl_v4(ct, ctx, pkt, conn_for_expectation,
    +            rc = process_ftp_ctl_v4(ct, pkt, conn_for_expectation,
                                         now, &v4_addr_rep, &ftp_data_start,
                                         &addr_offset_from_ftp_data_start);
             }
    
    This:
        /* Find first space. */
        while ((*ftp != ' ') && (*ftp != 0)) {
            ftp++;
        }
        if (*ftp != ' ') {
            return CT_FTP_CTL_INVALID;
        }
    can be written as just:
        /* Find first space. */
        ftp = strchr(ftp, ' ');
        if (!ftp) {
            return CT_FTP_CTL_INVALID;
        }

I did not measure the performance, but it is not critical here anyways.
strchr is cleaner and I used it.
    
    I see 3 casts to integer types in process_ftp_ctl_v4() but none of them
    appears to be necessary.  The (OVS_FORCE ovs_be16) cast seems especially
    odd since htons() actually returns an ovs_be16.  Same in
    process_ftp_ctl_v6().  Similarly for htonl(), twice, in handle_ftp_ctl().

I had simplified the code without removing the htons/l casts, somehow.
    
    I wonder whether the parsing in process_ftp_ctl_v4() would be easier
    using sscanf().

I had considered it first; the problem is the practical variability of legacy 
FTP message formats with different servers and clients. It would break in
corner cases.
    
    Again, thanks a lot for implementing this feature!
    
    Ben
    _______________________________________________
    dev mailing list
    dev@openvswitch.org
    https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=27h6skZaWHS36bASXEcQwe42XKJzVxLlkNHSXmK-mBk&s=jSXkenV2kpguUUU_EO31062IXf4CUHPvV-H4Y6dt3JQ&e=
Darrell Ball July 14, 2017, 7:02 p.m. UTC | #2
Correction inline 

On 7/14/17, 10:37 AM, "ovs-dev-bounces@openvswitch.org on behalf of Darrell Ball" <ovs-dev-bounces@openvswitch.org on behalf of dball@vmware.com> wrote:

    Thanks a lot for the review Ben
    
    
    
    On 7/13/17, 3:15 PM, "ovs-dev-bounces@openvswitch.org on behalf of Ben Pfaff" <ovs-dev-bounces@openvswitch.org on behalf of blp@ovn.org> wrote:
    
    
    
        On Wed, Jul 05, 2017 at 09:32:22PM -0700, Darrell Ball wrote:
    
        > ALG infra and FTP (both V4 and V6) support is added to the userspace

    
        > datapath.  Also, NAT support is included.

    
        > 

    
        > Signed-off-by: Darrell Ball <dlu998@gmail.com>

    
        
    
        Thanks a lot for working on this.  It will be a valuable feature that
    
        brings the userspace datapath closer to the kernel datapath features.
    
        
    
        I have some comments.  I'm not too familiar with this area, so a lot of
    
        my comments are ones that should help to make the code easier to
    
        understand not just for me but for other newbies to ALG implementation.
    
        
    
        The data structures introduced or modified in this patch are almost
    
        comment-free.  The reader is left to guess important information like
    
        the purpose of the structure, as well as per-member info like what lists
    
        or hmaps a structure belongs to, what a "master" is, and so on.  Please
    
        add some comments.
    
    
    
    Done
    
        
    
        The same seems warranted for the collection of macros that this
    
        defines.  For example, what does FTP_MAX_PORT mean?  (If it's just the
    
        maximum TCP port, what makes it FTP specific? etc.)
    
    
    
    Done
    
        
    
        In the name "alg_exp_node", I don't know what an exp is.
    
    
    
    Done
    
        
    
        I don't know what "delinate" means.
    
    
    
    Name changed
    
        
    
        What OS X problem does this allude to?
    
        +    /* CT_IPPORT_FTP is used in lieu of IPPORT_FTP as a workaround
    
        +     * to handle OSX. */
    
        +#define CT_IPPORT_FTP 21
    
    
    
    I added comment that it was build related.
    
        
    
        Usually we put the conversion on the constant side in comparisons like
    
        the following, because compilers are better at optimizing it:
    
        +    return (ip_proto == IPPROTO_TCP &&
    
        +            (ntohs(th->tcp_src) == CT_IPPORT_FTP ||
    
        +             ntohs(th->tcp_dst) == CT_IPPORT_FTP));
    
    
    
    Got it
    
        
    
        I guess that not everyone knows what "tuple space" is.  I had to think
    
        about it for a while.  I think you basically mean the ports available
    
        for NAT.  Maybe a more user friendly term could be used (at least in
    
        user messages)?  Why is exhaustion "likely a user error"?  (I would have
    
        guessed that it is more likely from some kind of DoS or port scan or
    
        equivalent.)  How should a user respond?
    
    
    
    Sure, DoS is valid too as the system may be unprotected. I expanded the
    
    comments and added recommendations.
    
    
    
        
    
        process_one() has a variable alg_nat_repl_addr that it zeros and then
    
        appears to never use again.
    
    
    
    I deprecated that variable use and I forgot to cleanup.
    
        
    
        Also in process_one(), I think that this memcpy:
    
                    memcpy(&alg_exp_entry, alg_exp, sizeof alg_exp_entry);
    
        can be written as just:
    
                    alg_exp_entry = *alg_exp;
    
        although I don't know whether you have some expectation for padding etc.
    
    
    
    structure copy is fine here; I thought I had caught all of these.
    
    
    
    
    
        process_one() uses the expression "conn && is_ftp_ctl(pkt)" twice, and
    
        the latter function is nontrivial.  Can it be evaluated just once?
    
    
    
    Yes, thanks for pointing that out.
    
        
    
        In conntrack_execute(), it seems odd to move the loop index declaration
    
        out of the for statement.
    
    
    
    This is day one conntrack code, but I can fix it here.
    
        
    
        conn_to_ct_dpif_entry() does an xstrdup but I wasn't able to spot where
    
        the string gets freed.
    
    
    
    This is an existing API that I just used; the caller (in dpctl) frees the string.
    
    I added a comment that the caller frees it.
    
        
    
        I can't see how get_v4_byte_be() works properly on both big-endian and
    
        little-endian systems.
    
        
    
        get_v4_byte_be() shouldn't need the "& 0xff", since it returns a
    
        uint8_t.
    
    
    
    right, mask is not needed
    
        
    
        In replace_substring(), it seems odd to use 8-bit quantities for sizes.
    
        Also, it looks like 'delta' can be negative, in which case it should not
    
        be plain char (which can be signed or unsigned given an ASCII character
    
        set): if you really want 8-bit, use "signed char" or int8_t.
    
    
    
    Thanks for catching this; this is a real potential bug depending on the compiler 
    
    and settings. At one point, I checked and converted all signed values to either
    
    int or int64_t; this looks like the only one left over.
    
        
    
        The expression "remain_substring + delta" in replace_substring() kind of
    
        threw me for a loop.  If you expand this based on variable definitions,
    
        you get:
    
        
    
                remain_substring + delta
    
             == (substr + substr_size) + (rep_str_size - substr_size)
    
             == substr + rep_str_size
    
        
    
        In other words, the substr_size cancels out entirely, so that I think
    
        the first four statements
    
        +    char delta = rep_str_size - substr_size;
    
        +    size_t move_size = total_size - substr_size;
    
        +    char *remain_substring = substr + substr_size;
    
        +    memmove(remain_substring + delta, remain_substring,
    
        +            move_size);
    
        might as well be written:
    
             memmove(substr + rep_str_size, substr + substr_size,
    
                     total_size - substr_size);
    
        which looks a lot simpler to me.
    
    
    
    I thought I was making the function easier to understand with the added variables 
    
    of delta, move_size and remain_substring. I realize now that I was wrong.
    
    
    
        
    
        In repl_ftp_v4_addr(), don't replace_size and rc have the same value?
    
        And there are two calls to strlen(rep_str).
    
    
    
    I simplified the function; thanks.
    
        
    
        I think that repl_ftp_v4_addr() would be easier to read if variables
    
        were declared at first use, more like this:
    
    
    
    Done in a few functions missed earlier.
    
        
    
        Also in repl_ftp_v4_addr(), it seems odd that we're dealing with
    
        maintaining a minimum frame length at this low level; I'd expect that to
    
        happen at a higher level.  At least that's what I assume the MAX(...,
    
        64) is for; maybe ETH_TOTAL_MIN would be better?
    
    
    
    This was really evil, I agree; this really needed a comment and I added one.
    
    This is the right place to do this as the packet is potentially being
    
    enlarged right here. The ‘64’ is power of 2 floor of these ftp packet sizes, 
    
    not min Ethernet. It is just for safety; ‘64’ should never be the larger value.
    
  >>>>>>>>>>>>>> 

I thought about this a little more.
The MAX will have no actual effect and is causing confusion.
I removed it.
<<<<<<<<<<<<<<<
        
    
        Instead of FTP_EPRT_CMD_SIZE (etc.), it might be better to just write
    
        strlen(FTP_EPRT_CMD).  The meaning is clear at a glance and modern
    
        compilers will optimize such an expression to a constant.
    
    
    
    sure, I could not confirm strlen would be optimized in all cases, but I’ll take
    
    your word for it. I considered sizeof(str literal) – 1, which I know works. Even
    
    if strlen is not optimized, this case is not critical, so I used it.
    
        
    
        I found the callers of detect_ftp_ctl_mode() puzzling at first because
    
        this function does two different things but each of its callers only
    
        care about one of them.  How about dividing it into two functions, like
    
        this?
    
    
    
    Sure, I knew this was ‘unclean’, but just never got to it.
    
    Fixed now.
    
    
    
        
    
        diff --git a/lib/conntrack.c b/lib/conntrack.c
    
        index 8d40e9eb809d..5c58e660972d 100644
    
        --- a/lib/conntrack.c
    
        +++ b/lib/conntrack.c
    
        @@ -123,7 +123,6 @@ repl_ftp_v4_addr(struct dp_packet *pkt, ovs_be32 v4_addr_rep,
    
         
    
         static enum ftp_ctl_pkt
    
         process_ftp_ctl_v4(struct conntrack *ct,
    
        -                   const struct conn_lookup_ctx *ctx,
    
                            struct dp_packet *pkt,
    
                            const struct conn *conn_for_expectation,
    
                            long long now, ovs_be32 *v4_addr_rep,
    
        @@ -132,7 +131,7 @@ process_ftp_ctl_v4(struct conntrack *ct,
    
         
    
         static enum ftp_ctl_pkt
    
         detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
    
        -                    struct dp_packet *pkt, char *ftp_msg);
    
        +                    struct dp_packet *pkt);
    
         
    
         static void
    
         handle_ftp_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
    
        @@ -2477,9 +2476,8 @@ delinate_number(char *str, uint8_t max_digits)
    
             return str;
    
         }
    
         
    
        -static enum ftp_ctl_pkt
    
        -detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
    
        -                    struct dp_packet *pkt, char *ftp_msg)
    
        +static void
    
        +get_ftp_ctl_msg(struct dp_packet *pkt, char *ftp_msg)
    
         {
    
             struct tcp_header *th = dp_packet_l4(pkt);
    
             char *tcp_hdr = (char *) th;
    
        @@ -2491,6 +2489,13 @@ detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
    
         
    
             ovs_strlcpy(ftp_msg, tcp_hdr + tcp_hdr_len,
    
                         tcp_payload_of_interest);
    
        +}
    
        +
    
        +static enum ftp_ctl_pkt
    
        +detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx, struct dp_packet *pkt)
    
        +{
    
        +    char ftp_msg[LARGEST_FTP_MSG_OF_INTEREST + 1] = {0};
    
        +    get_ftp_ctl_msg(pkt, ftp_msg);
    
         
    
             if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
    
                 if (strncasecmp(ftp_msg, FTP_EPRT_CMD, FTP_EPRT_CMD_SIZE) &&
    
        @@ -2510,7 +2515,6 @@ detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
    
         
    
         static enum ftp_ctl_pkt
    
         process_ftp_ctl_v4(struct conntrack *ct,
    
        -                   const struct conn_lookup_ctx *ctx,
    
                            struct dp_packet *pkt,
    
                            const struct conn *conn_for_expectation,
    
                            long long now, ovs_be32 *v4_addr_rep,
    
        @@ -2525,7 +2529,7 @@ process_ftp_ctl_v4(struct conntrack *ct,
    
             char *ftp = ftp_msg;
    
             enum ct_alg_mode mode;
    
         
    
        -     detect_ftp_ctl_mode(ctx, pkt, ftp_msg);
    
        +    get_ftp_ctl_msg(pkt, ftp_msg);
    
             *ftp_data_v4_start = tcp_hdr + tcp_hdr_len;
    
         
    
             if (!strncasecmp(ftp_msg, FTP_PORT_CMD, FTP_PORT_CMD_SIZE)) {
    
        @@ -2652,7 +2656,6 @@ skip_ipv6_digits(char *str)
    
         
    
         static enum ftp_ctl_pkt
    
         process_ftp_ctl_v6(struct conntrack *ct,
    
        -                   const struct conn_lookup_ctx *ctx,
    
                            struct dp_packet *pkt,
    
                            const struct conn *conn_for_expectation,
    
                            long long now,
    
        @@ -2668,7 +2671,7 @@ process_ftp_ctl_v6(struct conntrack *ct,
    
             char *ftp = ftp_msg;
    
             struct in6_addr ip6_addr;
    
         
    
        -     detect_ftp_ctl_mode(ctx, pkt, ftp_msg);
    
        +    get_ftp_ctl_msg(pkt, ftp_msg);
    
             *ftp_data_start = tcp_hdr + tcp_hdr_len;
    
         
    
             if (!strncasecmp(ftp_msg, FTP_EPRT_CMD, FTP_EPRT_CMD_SIZE)) {
    
        @@ -2803,13 +2806,12 @@ handle_ftp_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
    
             size_t addr_offset_from_ftp_data_start;
    
             int64_t seq_skew = 0;
    
             bool seq_skew_dir;
    
        -    char ftp_msg[LARGEST_FTP_MSG_OF_INTEREST + 1] = {0};
    
             size_t addr_size = 0;
    
             char *ftp_data_start;
    
             bool do_seq_skew_adj = true;
    
             enum ct_alg_mode mode = CT_FTP_MODE_ACTIVE;
    
         
    
        -    if (detect_ftp_ctl_mode(ctx, pkt, ftp_msg) != ftp_ctl) {
    
        +    if (detect_ftp_ctl_mode(ctx, pkt) != ftp_ctl) {
    
                 return;
    
             }
    
         
    
        @@ -2823,12 +2825,12 @@ handle_ftp_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
    
             } else if (ftp_ctl == CT_FTP_CTL_INTEREST) {
    
                 enum ftp_ctl_pkt rc;
    
                 if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
    
        -            rc = process_ftp_ctl_v6(ct, ctx, pkt, conn_for_expectation,
    
        +            rc = process_ftp_ctl_v6(ct, pkt, conn_for_expectation,
    
                                             now, &v6_addr_rep, &ftp_data_start,
    
                                             &addr_offset_from_ftp_data_start,
    
                                             &addr_size, &mode);
    
                 } else {
    
        -            rc = process_ftp_ctl_v4(ct, ctx, pkt, conn_for_expectation,
    
        +            rc = process_ftp_ctl_v4(ct, pkt, conn_for_expectation,
    
                                             now, &v4_addr_rep, &ftp_data_start,
    
                                             &addr_offset_from_ftp_data_start);
    
                 }
    
        
    
        This:
    
            /* Find first space. */
    
            while ((*ftp != ' ') && (*ftp != 0)) {
    
                ftp++;
    
            }
    
            if (*ftp != ' ') {
    
                return CT_FTP_CTL_INVALID;
    
            }
    
        can be written as just:
    
            /* Find first space. */
    
            ftp = strchr(ftp, ' ');
    
            if (!ftp) {
    
                return CT_FTP_CTL_INVALID;
    
            }
    
    
    
    I did not measure the performance, but it is not critical here anyways.
    
    strchr is cleaner and I used it.
    
        
    
        I see 3 casts to integer types in process_ftp_ctl_v4() but none of them
    
        appears to be necessary.  The (OVS_FORCE ovs_be16) cast seems especially
    
        odd since htons() actually returns an ovs_be16.  Same in
    
        process_ftp_ctl_v6().  Similarly for htonl(), twice, in handle_ftp_ctl().
    
    
    
    I had simplified the code without removing the htons/l casts, somehow.
    
        
    
        I wonder whether the parsing in process_ftp_ctl_v4() would be easier
    
        using sscanf().
    
    
    
    I had considered it first; the problem is the practical variability of legacy 
    
    FTP message formats with different servers and clients. It would break in
    
    corner cases.
    
        
    
        Again, thanks a lot for implementing this feature!
    
        
    
        Ben
    
        _______________________________________________
    
        dev mailing list
    
        dev@openvswitch.org
    
        https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=27h6skZaWHS36bASXEcQwe42XKJzVxLlkNHSXmK-mBk&s=jSXkenV2kpguUUU_EO31062IXf4CUHPvV-H4Y6dt3JQ&e= 
    
        
    
    
    
    _______________________________________________
    dev mailing list
    dev@openvswitch.org
    https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwIGaQ&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=idV5Ta99okL-wSYV5GIqbki2CFZH5kHm7NLxOmenuaM&s=hL8lsp4iRpxW74XEb8zVmilRD8EzgDd4ZphmytC1oTA&e=
Ben Pfaff July 14, 2017, 7:35 p.m. UTC | #3
Wow, that was fast.

I'll look forward to v5.

On Fri, Jul 14, 2017 at 05:37:43PM +0000, Darrell Ball wrote:
> Thanks a lot for the review Ben
> 
> On 7/13/17, 3:15 PM, "ovs-dev-bounces@openvswitch.org on behalf of Ben Pfaff" <ovs-dev-bounces@openvswitch.org on behalf of blp@ovn.org> wrote:
> 
>     On Wed, Jul 05, 2017 at 09:32:22PM -0700, Darrell Ball wrote:
>     > ALG infra and FTP (both V4 and V6) support is added to the userspace
>     > datapath.  Also, NAT support is included.
>     > 
>     > Signed-off-by: Darrell Ball <dlu998@gmail.com>
>     
>     Thanks a lot for working on this.  It will be a valuable feature that
>     brings the userspace datapath closer to the kernel datapath features.
>     
>     I have some comments.  I'm not too familiar with this area, so a lot of
>     my comments are ones that should help to make the code easier to
>     understand not just for me but for other newbies to ALG implementation.
>     
>     The data structures introduced or modified in this patch are almost
>     comment-free.  The reader is left to guess important information like
>     the purpose of the structure, as well as per-member info like what lists
>     or hmaps a structure belongs to, what a "master" is, and so on.  Please
>     add some comments.
> 
> Done
>     
>     The same seems warranted for the collection of macros that this
>     defines.  For example, what does FTP_MAX_PORT mean?  (If it's just the
>     maximum TCP port, what makes it FTP specific? etc.)
> 
> Done
>     
>     In the name "alg_exp_node", I don't know what an exp is.
> 
> Done
>     
>     I don't know what "delinate" means.
> 
> Name changed
>     
>     What OS X problem does this allude to?
>     +    /* CT_IPPORT_FTP is used in lieu of IPPORT_FTP as a workaround
>     +     * to handle OSX. */
>     +#define CT_IPPORT_FTP 21
> 
> I added comment that it was build related.
>     
>     Usually we put the conversion on the constant side in comparisons like
>     the following, because compilers are better at optimizing it:
>     +    return (ip_proto == IPPROTO_TCP &&
>     +            (ntohs(th->tcp_src) == CT_IPPORT_FTP ||
>     +             ntohs(th->tcp_dst) == CT_IPPORT_FTP));
> 
> Got it
>     
>     I guess that not everyone knows what "tuple space" is.  I had to think
>     about it for a while.  I think you basically mean the ports available
>     for NAT.  Maybe a more user friendly term could be used (at least in
>     user messages)?  Why is exhaustion "likely a user error"?  (I would have
>     guessed that it is more likely from some kind of DoS or port scan or
>     equivalent.)  How should a user respond?
> 
> Sure, DoS is valid too as the system may be unprotected. I expanded the
> comments and added recommendations.
> 
>     
>     process_one() has a variable alg_nat_repl_addr that it zeros and then
>     appears to never use again.
> 
> I deprecated that variable use and I forgot to cleanup.
>     
>     Also in process_one(), I think that this memcpy:
>                 memcpy(&alg_exp_entry, alg_exp, sizeof alg_exp_entry);
>     can be written as just:
>                 alg_exp_entry = *alg_exp;
>     although I don't know whether you have some expectation for padding etc.
> 
> structure copy is fine here; I thought I had caught all of these.
> 
> 
>     process_one() uses the expression "conn && is_ftp_ctl(pkt)" twice, and
>     the latter function is nontrivial.  Can it be evaluated just once?
> 
> Yes, thanks for pointing that out.
>     
>     In conntrack_execute(), it seems odd to move the loop index declaration
>     out of the for statement.
> 
> This is day one conntrack code, but I can fix it here.
>     
>     conn_to_ct_dpif_entry() does an xstrdup but I wasn't able to spot where
>     the string gets freed.
> 
> This is an existing API that I just used; the caller (in dpctl) frees the string.
> I added a comment that the caller frees it.
>     
>     I can't see how get_v4_byte_be() works properly on both big-endian and
>     little-endian systems.
>     
>     get_v4_byte_be() shouldn't need the "& 0xff", since it returns a
>     uint8_t.
> 
> right, mask is not needed
>     
>     In replace_substring(), it seems odd to use 8-bit quantities for sizes.
>     Also, it looks like 'delta' can be negative, in which case it should not
>     be plain char (which can be signed or unsigned given an ASCII character
>     set): if you really want 8-bit, use "signed char" or int8_t.
> 
> Thanks for catching this; this is a real potential bug depending on the compiler 
> and settings. At one point, I checked and converted all signed values to either
> int or int64_t; this looks like the only one left over.
>     
>     The expression "remain_substring + delta" in replace_substring() kind of
>     threw me for a loop.  If you expand this based on variable definitions,
>     you get:
>     
>             remain_substring + delta
>          == (substr + substr_size) + (rep_str_size - substr_size)
>          == substr + rep_str_size
>     
>     In other words, the substr_size cancels out entirely, so that I think
>     the first four statements
>     +    char delta = rep_str_size - substr_size;
>     +    size_t move_size = total_size - substr_size;
>     +    char *remain_substring = substr + substr_size;
>     +    memmove(remain_substring + delta, remain_substring,
>     +            move_size);
>     might as well be written:
>          memmove(substr + rep_str_size, substr + substr_size,
>                  total_size - substr_size);
>     which looks a lot simpler to me.
> 
> I thought I was making the function easier to understand with the added variables 
> of delta, move_size and remain_substring. I realize now that I was wrong.
> 
>     
>     In repl_ftp_v4_addr(), don't replace_size and rc have the same value?
>     And there are two calls to strlen(rep_str).
> 
> I simplified the function; thanks.
>     
>     I think that repl_ftp_v4_addr() would be easier to read if variables
>     were declared at first use, more like this:
> 
> Done in a few functions missed earlier.
>     
>     Also in repl_ftp_v4_addr(), it seems odd that we're dealing with
>     maintaining a minimum frame length at this low level; I'd expect that to
>     happen at a higher level.  At least that's what I assume the MAX(...,
>     64) is for; maybe ETH_TOTAL_MIN would be better?
> 
> This was really evil, I agree; this really needed a comment and I added one.
> This is the right place to do this as the packet is potentially being
> enlarged right here. The ‘64’ is power of 2 floor of these ftp packet sizes, 
> not min Ethernet. It is just for safety; ‘64’ should never be the larger value.
> 
>     
>     Instead of FTP_EPRT_CMD_SIZE (etc.), it might be better to just write
>     strlen(FTP_EPRT_CMD).  The meaning is clear at a glance and modern
>     compilers will optimize such an expression to a constant.
> 
> sure, I could not confirm strlen would be optimized in all cases, but I’ll take
> your word for it. I considered sizeof(str literal) – 1, which I know works. Even
> if strlen is not optimized, this case is not critical, so I used it.
>     
>     I found the callers of detect_ftp_ctl_mode() puzzling at first because
>     this function does two different things but each of its callers only
>     care about one of them.  How about dividing it into two functions, like
>     this?
> 
> Sure, I knew this was ‘unclean’, but just never got to it.
> Fixed now.
> 
>     
>     diff --git a/lib/conntrack.c b/lib/conntrack.c
>     index 8d40e9eb809d..5c58e660972d 100644
>     --- a/lib/conntrack.c
>     +++ b/lib/conntrack.c
>     @@ -123,7 +123,6 @@ repl_ftp_v4_addr(struct dp_packet *pkt, ovs_be32 v4_addr_rep,
>      
>      static enum ftp_ctl_pkt
>      process_ftp_ctl_v4(struct conntrack *ct,
>     -                   const struct conn_lookup_ctx *ctx,
>                         struct dp_packet *pkt,
>                         const struct conn *conn_for_expectation,
>                         long long now, ovs_be32 *v4_addr_rep,
>     @@ -132,7 +131,7 @@ process_ftp_ctl_v4(struct conntrack *ct,
>      
>      static enum ftp_ctl_pkt
>      detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
>     -                    struct dp_packet *pkt, char *ftp_msg);
>     +                    struct dp_packet *pkt);
>      
>      static void
>      handle_ftp_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
>     @@ -2477,9 +2476,8 @@ delinate_number(char *str, uint8_t max_digits)
>          return str;
>      }
>      
>     -static enum ftp_ctl_pkt
>     -detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
>     -                    struct dp_packet *pkt, char *ftp_msg)
>     +static void
>     +get_ftp_ctl_msg(struct dp_packet *pkt, char *ftp_msg)
>      {
>          struct tcp_header *th = dp_packet_l4(pkt);
>          char *tcp_hdr = (char *) th;
>     @@ -2491,6 +2489,13 @@ detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
>      
>          ovs_strlcpy(ftp_msg, tcp_hdr + tcp_hdr_len,
>                      tcp_payload_of_interest);
>     +}
>     +
>     +static enum ftp_ctl_pkt
>     +detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx, struct dp_packet *pkt)
>     +{
>     +    char ftp_msg[LARGEST_FTP_MSG_OF_INTEREST + 1] = {0};
>     +    get_ftp_ctl_msg(pkt, ftp_msg);
>      
>          if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
>              if (strncasecmp(ftp_msg, FTP_EPRT_CMD, FTP_EPRT_CMD_SIZE) &&
>     @@ -2510,7 +2515,6 @@ detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
>      
>      static enum ftp_ctl_pkt
>      process_ftp_ctl_v4(struct conntrack *ct,
>     -                   const struct conn_lookup_ctx *ctx,
>                         struct dp_packet *pkt,
>                         const struct conn *conn_for_expectation,
>                         long long now, ovs_be32 *v4_addr_rep,
>     @@ -2525,7 +2529,7 @@ process_ftp_ctl_v4(struct conntrack *ct,
>          char *ftp = ftp_msg;
>          enum ct_alg_mode mode;
>      
>     -     detect_ftp_ctl_mode(ctx, pkt, ftp_msg);
>     +    get_ftp_ctl_msg(pkt, ftp_msg);
>          *ftp_data_v4_start = tcp_hdr + tcp_hdr_len;
>      
>          if (!strncasecmp(ftp_msg, FTP_PORT_CMD, FTP_PORT_CMD_SIZE)) {
>     @@ -2652,7 +2656,6 @@ skip_ipv6_digits(char *str)
>      
>      static enum ftp_ctl_pkt
>      process_ftp_ctl_v6(struct conntrack *ct,
>     -                   const struct conn_lookup_ctx *ctx,
>                         struct dp_packet *pkt,
>                         const struct conn *conn_for_expectation,
>                         long long now,
>     @@ -2668,7 +2671,7 @@ process_ftp_ctl_v6(struct conntrack *ct,
>          char *ftp = ftp_msg;
>          struct in6_addr ip6_addr;
>      
>     -     detect_ftp_ctl_mode(ctx, pkt, ftp_msg);
>     +    get_ftp_ctl_msg(pkt, ftp_msg);
>          *ftp_data_start = tcp_hdr + tcp_hdr_len;
>      
>          if (!strncasecmp(ftp_msg, FTP_EPRT_CMD, FTP_EPRT_CMD_SIZE)) {
>     @@ -2803,13 +2806,12 @@ handle_ftp_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
>          size_t addr_offset_from_ftp_data_start;
>          int64_t seq_skew = 0;
>          bool seq_skew_dir;
>     -    char ftp_msg[LARGEST_FTP_MSG_OF_INTEREST + 1] = {0};
>          size_t addr_size = 0;
>          char *ftp_data_start;
>          bool do_seq_skew_adj = true;
>          enum ct_alg_mode mode = CT_FTP_MODE_ACTIVE;
>      
>     -    if (detect_ftp_ctl_mode(ctx, pkt, ftp_msg) != ftp_ctl) {
>     +    if (detect_ftp_ctl_mode(ctx, pkt) != ftp_ctl) {
>              return;
>          }
>      
>     @@ -2823,12 +2825,12 @@ handle_ftp_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
>          } else if (ftp_ctl == CT_FTP_CTL_INTEREST) {
>              enum ftp_ctl_pkt rc;
>              if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
>     -            rc = process_ftp_ctl_v6(ct, ctx, pkt, conn_for_expectation,
>     +            rc = process_ftp_ctl_v6(ct, pkt, conn_for_expectation,
>                                          now, &v6_addr_rep, &ftp_data_start,
>                                          &addr_offset_from_ftp_data_start,
>                                          &addr_size, &mode);
>              } else {
>     -            rc = process_ftp_ctl_v4(ct, ctx, pkt, conn_for_expectation,
>     +            rc = process_ftp_ctl_v4(ct, pkt, conn_for_expectation,
>                                          now, &v4_addr_rep, &ftp_data_start,
>                                          &addr_offset_from_ftp_data_start);
>              }
>     
>     This:
>         /* Find first space. */
>         while ((*ftp != ' ') && (*ftp != 0)) {
>             ftp++;
>         }
>         if (*ftp != ' ') {
>             return CT_FTP_CTL_INVALID;
>         }
>     can be written as just:
>         /* Find first space. */
>         ftp = strchr(ftp, ' ');
>         if (!ftp) {
>             return CT_FTP_CTL_INVALID;
>         }
> 
> I did not measure the performance, but it is not critical here anyways.
> strchr is cleaner and I used it.
>     
>     I see 3 casts to integer types in process_ftp_ctl_v4() but none of them
>     appears to be necessary.  The (OVS_FORCE ovs_be16) cast seems especially
>     odd since htons() actually returns an ovs_be16.  Same in
>     process_ftp_ctl_v6().  Similarly for htonl(), twice, in handle_ftp_ctl().
> 
> I had simplified the code without removing the htons/l casts, somehow.
>     
>     I wonder whether the parsing in process_ftp_ctl_v4() would be easier
>     using sscanf().
> 
> I had considered it first; the problem is the practical variability of legacy 
> FTP message formats with different servers and clients. It would break in
> corner cases.
>     
>     Again, thanks a lot for implementing this feature!
>     
>     Ben
>     _______________________________________________
>     dev mailing list
>     dev@openvswitch.org
>     https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=27h6skZaWHS36bASXEcQwe42XKJzVxLlkNHSXmK-mBk&s=jSXkenV2kpguUUU_EO31062IXf4CUHPvV-H4Y6dt3JQ&e= 
>     
>
diff mbox

Patch

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 8d40e9eb809d..5c58e660972d 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -123,7 +123,6 @@  repl_ftp_v4_addr(struct dp_packet *pkt, ovs_be32 v4_addr_rep,
 
 static enum ftp_ctl_pkt
 process_ftp_ctl_v4(struct conntrack *ct,
-                   const struct conn_lookup_ctx *ctx,
                    struct dp_packet *pkt,
                    const struct conn *conn_for_expectation,
                    long long now, ovs_be32 *v4_addr_rep,
@@ -132,7 +131,7 @@  process_ftp_ctl_v4(struct conntrack *ct,
 
 static enum ftp_ctl_pkt
 detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
-                    struct dp_packet *pkt, char *ftp_msg);
+                    struct dp_packet *pkt);
 
 static void
 handle_ftp_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
@@ -2477,9 +2476,8 @@  delinate_number(char *str, uint8_t max_digits)
     return str;
 }
 
-static enum ftp_ctl_pkt
-detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
-                    struct dp_packet *pkt, char *ftp_msg)
+static void
+get_ftp_ctl_msg(struct dp_packet *pkt, char *ftp_msg)
 {
     struct tcp_header *th = dp_packet_l4(pkt);
     char *tcp_hdr = (char *) th;
@@ -2491,6 +2489,13 @@  detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
 
     ovs_strlcpy(ftp_msg, tcp_hdr + tcp_hdr_len,
                 tcp_payload_of_interest);
+}
+
+static enum ftp_ctl_pkt
+detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx, struct dp_packet *pkt)
+{
+    char ftp_msg[LARGEST_FTP_MSG_OF_INTEREST + 1] = {0};
+    get_ftp_ctl_msg(pkt, ftp_msg);
 
     if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
         if (strncasecmp(ftp_msg, FTP_EPRT_CMD, FTP_EPRT_CMD_SIZE) &&
@@ -2510,7 +2515,6 @@  detect_ftp_ctl_mode(const struct conn_lookup_ctx *ctx,
 
 static enum ftp_ctl_pkt
 process_ftp_ctl_v4(struct conntrack *ct,
-                   const struct conn_lookup_ctx *ctx,
                    struct dp_packet *pkt,
                    const struct conn *conn_for_expectation,
                    long long now, ovs_be32 *v4_addr_rep,
@@ -2525,7 +2529,7 @@  process_ftp_ctl_v4(struct conntrack *ct,
     char *ftp = ftp_msg;
     enum ct_alg_mode mode;
 
-     detect_ftp_ctl_mode(ctx, pkt, ftp_msg);
+    get_ftp_ctl_msg(pkt, ftp_msg);
     *ftp_data_v4_start = tcp_hdr + tcp_hdr_len;
 
     if (!strncasecmp(ftp_msg, FTP_PORT_CMD, FTP_PORT_CMD_SIZE)) {
@@ -2652,7 +2656,6 @@  skip_ipv6_digits(char *str)
 
 static enum ftp_ctl_pkt
 process_ftp_ctl_v6(struct conntrack *ct,
-                   const struct conn_lookup_ctx *ctx,
                    struct dp_packet *pkt,
                    const struct conn *conn_for_expectation,
                    long long now,
@@ -2668,7 +2671,7 @@  process_ftp_ctl_v6(struct conntrack *ct,
     char *ftp = ftp_msg;
     struct in6_addr ip6_addr;
 
-     detect_ftp_ctl_mode(ctx, pkt, ftp_msg);
+    get_ftp_ctl_msg(pkt, ftp_msg);
     *ftp_data_start = tcp_hdr + tcp_hdr_len;
 
     if (!strncasecmp(ftp_msg, FTP_EPRT_CMD, FTP_EPRT_CMD_SIZE)) {
@@ -2803,13 +2806,12 @@  handle_ftp_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
     size_t addr_offset_from_ftp_data_start;
     int64_t seq_skew = 0;
     bool seq_skew_dir;
-    char ftp_msg[LARGEST_FTP_MSG_OF_INTEREST + 1] = {0};
     size_t addr_size = 0;
     char *ftp_data_start;
     bool do_seq_skew_adj = true;
     enum ct_alg_mode mode = CT_FTP_MODE_ACTIVE;
 
-    if (detect_ftp_ctl_mode(ctx, pkt, ftp_msg) != ftp_ctl) {
+    if (detect_ftp_ctl_mode(ctx, pkt) != ftp_ctl) {
         return;
     }
 
@@ -2823,12 +2825,12 @@  handle_ftp_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
     } else if (ftp_ctl == CT_FTP_CTL_INTEREST) {
         enum ftp_ctl_pkt rc;
         if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
-            rc = process_ftp_ctl_v6(ct, ctx, pkt, conn_for_expectation,
+            rc = process_ftp_ctl_v6(ct, pkt, conn_for_expectation,
                                     now, &v6_addr_rep, &ftp_data_start,
                                     &addr_offset_from_ftp_data_start,
                                     &addr_size, &mode);
         } else {
-            rc = process_ftp_ctl_v4(ct, ctx, pkt, conn_for_expectation,
+            rc = process_ftp_ctl_v4(ct, pkt, conn_for_expectation,
                                     now, &v4_addr_rep, &ftp_data_start,
                                     &addr_offset_from_ftp_data_start);
         }