Message ID | 20220510095553.4056033-1-daniel@dd-wrt.com |
---|---|
State | Superseded |
Delegated to: | Hauke Mehrtens |
Headers | show |
Series | umbim: fix invalid mbim message string encoding | expand |
Daniel Danzberger <daniel@dd-wrt.com> writes: > Strings in mbim messages have to follow these formatting rules: > - 4 byte alignment, padded if not. ack. MBIM spec section 10.3: All fields (either static or variable size) start at a 4-byte boundary. Hence, all field offsets shall be multiples of 4. Between the end of a field (not necessarily at a 4-byte boundary) and the beginning of the next field (always at a 4-byte boundary), padding may be required. Padding bytes must be set to 0 upon transmission and ignored upon reception. .. If the size of the payload in the variable field is not a multiple of 4 bytes, the field shall be padded up to the next 4 byte multiple. This shall be true even for the last payload in DataBuffer. > - utf-16 little endian. ack. MBIM spec section 10.4: Unless otherwise specified, all strings use UNICODE UTF-16LE encodings limited to characters from the Basic Multilingual Plane. Strings shall not be terminated by a NULL character. > + /* convert to utf-16 little endian */ > for (i = 0; i < l; i++) > - p[i * 2] = in[i]; > + p[i * 2] = htole16(in[i]); > > return 0; > } This new code is buggy. It fails on BE and makes no difference on LE. Both p and in are byte arrays. The byte you write into p on a BE system will be aither 0x00 or 0xff depending on the MSB of in[i]. The previous code was sort of correct, assuming that the input is mostly ascii mapping to the same value in unicode. Ideally we should do real utf16le conversion of the input. But then someone has to decide if the input is utf8 or something else first. And you need a real utf16 implementation. Not sure it's worth it. Did you ever see an operator use a non-ascii APN, username, password or pin code? Bjørn
On Tue, 2022-05-10 at 12:57 +0200, Bjørn Mork wrote: > Unless otherwise specified, all strings use UNICODE UTF-16LE > encodings limited to characters from the Basic Multilingual > Plane. Strings shall not be terminated by a NULL character. > > > + /* convert to utf-16 little endian */ > > for (i = 0; i < l; i++) > > - p[i * 2] = in[i]; > > + p[i * 2] = htole16(in[i]); > > > > return 0; > > } > > > This new code is buggy. It fails on BE and makes no difference on > LE. Both p and in are byte arrays. The byte you write into p on a BE > system will be aither 0x00 or 0xff depending on the MSB of in[i]. True, I will remove that le16 conversion part and retest. > > The previous code was sort of correct, assuming that the input is mostly > ascii mapping to the same value in unicode. Ideally we should do real > utf16le conversion of the input. But then someone has to decide if the > input is utf8 or something else first. And you need a real utf16 > implementation. Not sure it's worth it. Did you ever see an operator > use a non-ascii APN, username, password or pin code? Agree, it's best for now to just leave it as is.
diff --git a/mbim-msg.c b/mbim-msg.c index 5ec04f4..8465091 100644 --- a/mbim-msg.c +++ b/mbim-msg.c @@ -53,8 +53,10 @@ mbim_add_payload(uint8_t len) int mbim_encode_string(struct mbim_string *str, char *in) { - int l = strlen(in); - int s = mbim_add_payload(l * 2); + const int l = strlen(in); + const int utf16_len = l * 2; + const int pad_len = utf16_len % 4; + const int s = mbim_add_payload(utf16_len + pad_len); uint8_t *p = &payload_buffer[s]; int i; @@ -62,14 +64,15 @@ mbim_encode_string(struct mbim_string *str, char *in) return -1; str->offset = htole32(s); - str->length = htole32(l * 2); + str->length = htole32(utf16_len); + + /* convert to utf-16 little endian */ for (i = 0; i < l; i++) - p[i * 2] = in[i]; + p[i * 2] = htole16(in[i]); return 0; } - char * mbim_get_string(struct mbim_string *str, char *in) {
Strings in mbim messages have to follow these formatting rules: - 4 byte alignment, padded if not. - utf-16 little endian. Fixes: - mbim connect fails with more than 1 string parameter (apn/user/pass) when they are not 4 byte aligned. Signed-off-by: Daniel Danzberger <daniel@dd-wrt.com> --- mbim-msg.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)