Gnulib does not currently implement the `F_GETFL` and `F_SETFL` fcntl
commands on mingw. As a result, building Wget failed on Windows. We don't
currently have a good solution to this problem, so Windows users of Wget
will not get the new non-blocking file I/O reads. The current workaround
is simply to provide a Windows-specific no-op stub.
* src/mswindows.c (set_fd_nonblocking): Provide an empty stub function
for Windows systems
* src/mswindows.h (set_fd_nonblocking): Export the function so it is
available in utils.c where it is needed
* src/utils.c (set_fd_nonblocking): New function to set a fd as
O_NONBLOCK on non-Windows, non-MSDOS systems
(wget_read_from_file): Split the O_NONBLOCK code into
set_fd_nonblocking and make all file I/O non-blocking, not just stdin
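
A minimal sketch of the split described above, not the code from the
commit: the function name `sketch_set_fd_nonblocking` and the platform
macros used to select the stub are assumptions made for illustration.
On POSIX systems the existing status flags are fetched with `F_GETFL`
and written back with `O_NONBLOCK` added; elsewhere the function is a
no-op that reports success.

```c
#if defined(_WIN32) || defined(__MSDOS__)   /* assumed platform checks */

/* Stub: nothing to do where F_GETFL/F_SETFL are unavailable. */
int sketch_set_fd_nonblocking(int fd)
{
  (void) fd;
  return 0;
}

#else

#include <fcntl.h>

/* Returns 0 on success, -1 if either fcntl call fails. */
int sketch_set_fd_nonblocking(int fd)
{
  int flags = fcntl(fd, F_GETFL, 0);      /* read current status flags */
  if (flags < 0)
    return -1;
  /* Preserve the existing flags and add O_NONBLOCK. */
  return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

#endif
```

Keeping the platform test inside one helper means callers such as the
file-reading code do not need their own `#ifdef` blocks.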
GCC 12 and later report an instance of -Warray-bounds in
setval_internal_tilde with the default -DNDEBUG:
```
In function ‘setval_internal_tilde’,
inlined from ‘run_wgetrc’ at init.c:710:16:
init.c:940:17: error: array subscript [0, 167] is outside array bounds of ‘const struct <anonymous>[168]’ [-Werror=array-bounds=]
940 | if (((commands[comind].action == cmd_file) ||
| ~~~~~~~~^~~~~~~~
init.c: In function ‘run_wgetrc’:
init.c:135:3: note: while referencing ‘commands’
135 | } commands[] = {
| ^~~~~~~~
```
setval_internal_tilde calls setval_internal and stores the result in ret;
setval_internal *does* check whether comind is out of bounds, but we only
test ret *after* dereferencing commands[comind]. Swap the order of the
conditions in the if() so that we only dereference when we know it's safe.
ChangeLog:
* src/init.c (setval_internal_tilde): Check 'ret' earlier.
Copyright-paperwork-exempt: Yes
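
The reordering described for setval_internal_tilde can be sketched as
follows. This is an illustration only: the table, the enum, and the
functions `sketch_setval` and `sketch_wants_tilde_expansion` are
hypothetical stand-ins, not the definitions in src/init.c. The point is
that `ret` is the first operand of `&&`, so the table is indexed only
after the bounds check has succeeded.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_action { SKETCH_FILE, SKETCH_DIRECTORY, SKETCH_STRING };

/* Illustrative command table (hypothetical entries). */
static const struct {
  const char *name;
  enum sketch_action action;
} sketch_commands[] = {
  { "inputfile", SKETCH_FILE },
  { "dirprefix", SKETCH_DIRECTORY },
  { "useragent", SKETCH_STRING },
};

#define SKETCH_COUNT (sizeof sketch_commands / sizeof sketch_commands[0])

/* Stand-in for the setter: fails when the index is out of range. */
static bool sketch_setval(int comind)
{
  return comind >= 0 && (size_t) comind < SKETCH_COUNT;
}

/* True when the value should get tilde expansion. */
bool sketch_wants_tilde_expansion(int comind, const char *val)
{
  bool ret = sketch_setval(comind);

  /* 'ret' comes first: && short-circuits, so sketch_commands[comind]
     is evaluated only for an index known to be in bounds. */
  return ret
         && (sketch_commands[comind].action == SKETCH_FILE
             || sketch_commands[comind].action == SKETCH_DIRECTORY)
         && val[0] == '~';
}
```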
* src/url.c (url_skip_credentials): Properly re-implement userinfo parsing (RFC 2396)
The implementation is based on RFC 2396, an outdated standard, because
the whole file is based on that RFC, and mixing standards here might be
dangerous.
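
As a rough illustration of userinfo parsing under RFC 2396, the sketch
below walks the text that follows the scheme and accepts only the
characters the RFC's `userinfo` rule allows (alphanumerics, the "mark"
characters, `%XX` escapes, and `; : & = + $ ,`). The function name
`sketch_skip_userinfo` is hypothetical, and this is not claimed to be
the code in src/url.c.

```c
#include <ctype.h>
#include <string.h>

static int sketch_is_ascii_alnum(unsigned char c)
{
  return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z')
         || (c >= 'a' && c <= 'z');
}

/* Given the part of a URL after "scheme://", return a pointer just past
   "userinfo@" if a valid userinfo is present, otherwise return s. */
const char *sketch_skip_userinfo(const char *s)
{
  /* RFC 2396 marks plus the extra characters allowed in userinfo. */
  static const char allowed[] = "-_.!~*'();:&=+$,";

  for (const char *p = s; *p; p++)
    {
      unsigned char c = (unsigned char) *p;

      if (sketch_is_ascii_alnum(c) || strchr(allowed, c))
        continue;

      /* Percent-escape: '%' followed by two hex digits. */
      if (c == '%' && isxdigit((unsigned char) p[1])
          && isxdigit((unsigned char) p[2]))
        {
          p += 2;
          continue;
        }

      if (c == '@')
        return p + 1;   /* everything before '@' was valid userinfo */

      break;            /* any other character ends the scan */
    }

  return s;             /* no credentials found */
}
```

With this approach an `@` that appears after a `/`, `?`, or `#` is never
treated as the end of credentials, because the scan stops at the first
character outside the userinfo set.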
Rather than reading from stdin only once, leave the pipe open until the
other end closes it and keep reading from the file after each set of
URLs is read.
* src/html-url.h (get_urls_file): Update prototype to add an additional
parameter.
* src/html-url.c (get_urls_file): Pass read_again through to
wget_read_from_file.
* src/retr.c (retrieve_from_file): Split the function into two. Introduce
`retrieve_from_url_list`, which actually performs the retrieval.
Also, if `url_list` reports that the fd has been left open, then
continue reading from it until the fd is closed.
(retrieve_from_url_list): New function that does the retrieval from
a list of URLs that was read from a file.
* src/utils.c (wget_read_from_file): Rename old function `wget_read_file`
to this.
Accept an additional output parameter that states whether the fd was
left open and whether we should continue reading from it after the
current set of URLs has been processed.
(wget_read_file): Write it as a new wrapper function around
`wget_read_from_file` to maintain API compatibility for other users.
The definition of debug_logprintf in src/log.c is guarded by ENABLE_DEBUG
(although its prototype is unconditionally available in src/log.h).
The uses of debug_logprintf in src/retr.c aren't guarded by ENABLE_DEBUG.
Use the DEBUGP macro which is designed for this purpose.
* src/retr.c (getproxy): Use DEBUGP macro.
Fixes: https://gitlab.com/gnuwget/wget/-/issues/19
Copyright-paperwork-exempt: Yes
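
The guard pattern behind the getproxy change can be sketched like this.
The names `SKETCH_DEBUGP`, `sketch_debug_logprintf`, and
`sketch_pick_proxy` are hypothetical and are not Wget's definitions; the
sketch only shows why a macro that expands to nothing keeps call sites
building when the debug function is compiled out.

```c
#include <stdarg.h>
#include <stdio.h>

#ifdef ENABLE_DEBUG
/* Only defined in debug builds, like the guarded function in the text. */
static void sketch_debug_logprintf(const char *fmt, ...)
{
  va_list ap;
  va_start(ap, fmt);
  vfprintf(stderr, fmt, ap);
  va_end(ap);
}
/* The double parentheses at the call site carry the whole argument list. */
# define SKETCH_DEBUGP(args) do { sketch_debug_logprintf args; } while (0)
#else
/* Without ENABLE_DEBUG the call disappears, so no undefined symbol. */
# define SKETCH_DEBUGP(args) do { } while (0)
#endif

/* Returns the candidate proxy string, or NULL when it is unset or empty. */
const char *sketch_pick_proxy(const char *candidate)
{
  if (candidate == NULL || *candidate == '\0')
    return NULL;

  SKETCH_DEBUGP(("using proxy %s\n", candidate));
  return candidate;
}
```

A direct call to `sketch_debug_logprintf` in `sketch_pick_proxy` would
fail to build without `ENABLE_DEBUG`, which is the kind of failure the
commit addresses.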
* testenv/Test-recursive-pathmax.py: Add a new testcase. This test tries
to check that Wget allows downloading long filenames as far as allowed
by the OS and filesystem.
* tests/Makefile.am: Remove some tests that are redundant with the
Python testenv
* tests/Test-auth-basic.px: Delete file
* tests/Test-auth-no-challenge.px: Same
* tests/Test-auth-no-challenge-url.px: Same
* tests/Test-auth-retcode.px: Same
* tests/Test-auth-with-content-disposition.px: Same
* tests/Test-k.px: Same