* bootstrap.conf: Add gnulib module ftruncate
* include/wget/wget.h: Add wget_truncate
* libwget/io.c: Add wget_truncate
* src/job.c: Call wget_truncate instead of truncate
The truncate() function is not covered by gnulib.
wget_truncate() is a wrapper around ftruncate(), which is covered
by gnulib.
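A minimal sketch of such a wrapper, assuming a path-based interface like
truncate(). The name truncate_by_path() is hypothetical and this is not the
actual wget_truncate() source; it only illustrates how a path-based truncate
can be built on ftruncate():

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not the real wget_truncate()):
 * truncate the file at 'path' to 'length' bytes via ftruncate().
 * Returns 0 on success, -1 on error. */
int truncate_by_path(const char *path, off_t length)
{
	int fd, rc;

	if (!path)
		return -1;

	/* ftruncate() needs a descriptor that is open for writing */
	if ((fd = open(path, O_WRONLY)) == -1)
		return -1;

	rc = ftruncate(fd, length);
	close(fd);

	return rc;
}
```

Opening the file first costs one extra system call compared to truncate(),
but it relies only on functions that gnulib can replace on platforms that
lack them.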
* include/wget/wget.h: Make library symbol export attribute
((__visibility__)) precede the function prototype to work on Windows
with MSVC
* tests/libtest.h: Same
Reported-by: Gisle Vanem
* include/wget/wget.h: Remove preprocessor definitions for xmalloc,
xcalloc and xrealloc
* src/job.c (job_validate_file): Replace xcalloc with wget_calloc
* src/wget.c (main): Same
(process_head_response): Same
(_prepare_file): Same
(http_send_request): Same
* bootstrap.conf: Add gnulib module 'inet_pton'
* docs/Makefile.am: Add man page libwget-ip.3
* wget/wget.h (struct wget_iri_st): New field 'is_ip_address'.
* libwget/Makefile.am: Add new file ip.c
* libwget/ip.c: New file defining new function wget_ip_is_family()
* libwget/iri.c (wget_iri_parse): Check for literal IP address
* src/wget.c: Do not add to HSTS if literal IP address
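The literal IP address check can be sketched with inet_pton(), which the
new gnulib module provides. The helper names is_ip_literal() and
host_is_ip_address() are hypothetical and this is not the actual
wget_ip_is_family() code. The sketch assumes that the brackets around an
IPv6 literal have already been stripped from the host part of the URL:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/* Hypothetical helper: true if 'host' is a literal address of the
 * given family (AF_INET or AF_INET6). */
bool is_ip_literal(const char *host, int family)
{
	unsigned char buf[sizeof(struct in6_addr)]; /* large enough for both families */

	if (!host)
		return false;

	/* inet_pton() returns 1 only if 'host' is a valid address of 'family' */
	return inet_pton(family, host, buf) == 1;
}

/* Hypothetical helper: true if 'host' is an IPv4 or IPv6 literal.
 * A caller could skip the HSTS update for such hosts. */
bool host_is_ip_address(const char *host)
{
	return is_ip_literal(host, AF_INET) || is_ip_literal(host, AF_INET6);
}
```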
* include/wget/wget.h (wget_base64_get_decoded_length): new function
* libwget/tls_session.c (_tls_session_db_load): use
wget_base64_get_decoded_length().
* configure.ac: Suppress -Wunsuffixed-float-constants for gcc,
add -Wno-disabled-macro-expansion to clang flags
* examples/getstream.c: Removed 'unused' attributes from argc and argv.
* include/wget/wget.h: Fix slot to int for bar functions.
* libwget/bar.c: Fix slot to int for bar functions, cast to int,
change type of bitfield 'redraw' to unsigned.
* libwget/hsts.c (wget_hsts_new): Fix int to bitfield conversion.
* libwget/http.c: Cast printf %p format to (void *).
* src/wget.c: Cast printf %p format to (void *),
remove ACTION_DONE from enum actions.
* src/wget_host.h: Fix double declaration of host_remove_job()
* tests/libtest.c (wget_test_check_filesystem): Fix type of 'rc'.
* tests/test-idn-robots.c: Fix comma to semicolon at end of line.
* tests/test.c (test_strcasecmp_ascii): Fix type of loop variable.
* include/wget/wget.h: wget_get_screen_size() must not be pure
* libwget/bar.c (_bar_get_width): Leave one space right to avoid linebreaks
* libwget/utils.c (wget_human_readable): Align # of bytes
* src/bar.c (wget_get_screen_size): Fix return value
* include/wget/wget.h: New function wget_get_screen_size. Replaces old
wget_determine_screen_width
* libwget/bar.c (_bar_get_width): Use wget_get_screen_size instead of
wget_determine_screen_width
* libwget/utils.c: Replace function wget_determine_screen_width with a
more generic wget_get_screen_size.
* include/wget/wget.h: New function wget_bar_screen_resized, as an
indicator that the screen size may have changed. Also, wget_bar_update
may now modify the *bar parameter
* libwget/bar.c: Move detection code for screen size into a separate
function called _bar_get_width.
(wget_bar_update): Check if the screen width has changed and
reallocate more space if necessary
* src/wget.c: Install a new signal handler for SIGWINCH
* include/wget/wget.h: Declare new function wget_determine_screen_width.
Also change prototype of wget_bar_init to not accept a max_width
parameter
* src/utils.c: Move function determine_screen_width from here ...
* libwget/utils.c: ... to here.
* src/utils.{c,h}: Delete empty file
* src/Makefile.am: Remove source files utils.{c,h}
* src/bar.c: Move detection of screen width from here ...
* libwget/bar.c: ... to here
* tests/test.c: wget_bar_init() no longer takes a max_width parameter
* include/wget/wget.h (_wget_bar_ctx): Remove unused elements
expected_size and filename. Rename slotpos to _slotpos to reflect that
it should not be used by the client.
* include/wget/wget.h: Remove declaration for old API functions
bar_register and bar_deregister. Define declarations for
wget_bar_slot_begin, wget_bar_slot_register and
wget_bar_slot_deregister.
* libwget/bar.c (_bar_slot_status_t): Define new enum to store the
status of a slot of the progress bar.
(_bar_slot_t): Remove unused element last_ctx. Also, define slot
elements: filename, file_size, bytes_downloaded and status.
(wget_bar_slot_begin): Define new function. Invoked when the client
has started downloading a new file.
(wget_bar_slot_register): New function to register a new file in a
progress bar slot.
(wget_bar_deregister): Remove old function
(wget_bar_slot_deregister): New function, called after a file has
finished downloading, successfully or unsuccessfully.
(_bar_set_progress): last_ctx is no longer used. Instead data is
stored within the slot struct itself
(_bar_update_slot): Same
(_bar_print_final): Same
(wget_bar_deinit): Free the filename stored in the slot
* src/bar.c: Edit to reflect the changes in libwget API
* src/wget.c (_get_header): Same
(http_send_request): Same
(http_receive_response): Same
* src/wget_bar.h: Same
* include/libwget.h.in: Add new function wget_bar_set_slots
* libwget/bar.c (_wget_bar_st): Add new struct member max_slots to store
the maximum number of slots of the progress bar
(wget_bar_init): Initialize the memory for all the possible
progress_bars
(wget_bar_set_slots): New function to modify the number of progress
slots at runtime. Can only increase the number of slots, not decrease.
* src/bar.c (bar_init): Remove unneeded variable lf. bar_init() is
invoked before any threads are created, hence do not print the required
newlines. This is now done by wget_bar_set_slots
(bar_update_slots): Wget2 interface for libwget's wget_bar_set_slots
(_error_write): Fix the position of acquiring the mutex. Don't make
changes to the screen while some other progress bar function may be
running
* src/bar.h: New function bar_update_slots
* src/wget.c: Create global variable nthreads.
(downloader_thread): Update the number of progress bar slots
* src/wget.h: New global variable nthreads
* bootstrap.conf: Remove module 'human'
* include/libwget.h.in: Amend wget_human_readable()
* libwget/bar.c: Use new function wget_human_readable()
* libwget/utils.c: Replace wget_human_readable()
* src/wget.c: Use new function wget_human_readable()
* include/libwget.h.in: Add wget_human_readable_alloc()
* libwget/utils.c: Add wget_human_readable_alloc()
* libwget/bar.c: Use wget_human_readable_alloc()
* src/bar.c (bar_init): Set log function after call to wget_bar_init()
* src/wget.c (main): Use wget_human_readable() to print bottom line of bar
* include/libwget.h.in: Add function wget_list_getnext().
* libwget/list.c: Add function wget_list_getnext().
* libwget/robots.c: Fix memory leak.
* src/host.c (host_remove_job): Cleanup queue after downloading and
scanning robots.txt.
* src/job.h (struct JOB): Add flag 'requested_by_user'.
* src/wget.c (add_url_to_queue): Set 'requested_by_user',
(add_url): Fix checking for disallowed paths.
* tests/Makefile.am: Add test 'test-robots'.
* tests/test-robots.c: New test to prove robots functionality.
Special handling for automatic robots.txt jobs
==============================================
What can happen with --recursive and --span-hosts is that a document from hostA
has links to hostB. All these links might go into the hostB queue before robots.txt
is downloaded and parsed. To avoid downloading of 'disallowed' documents, the queue
for hostB has to be cleaned up right after downloading and parsing robots.txt.
Any links that have been explicitly requested by the user are still downloaded.
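The cleanup step can be sketched as follows. This is a simplified
illustration, not the code in src/host.c: struct job, prune_queue() and the
array-based queue are hypothetical, and 'Disallow' rules are treated as
plain path prefixes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced job record */
struct job {
	const char *path;
	bool requested_by_user;
};

/* True if 'path' starts with one of the 'Disallow' prefixes.
 * An empty prefix disallows nothing. */
static bool path_is_disallowed(const char *path,
	const char *const *disallowed, size_t ndisallowed)
{
	for (size_t i = 0; i < ndisallowed; i++) {
		size_t len = strlen(disallowed[i]);

		if (len && strncmp(path, disallowed[i], len) == 0)
			return true;
	}

	return false;
}

/* Remove disallowed jobs in place after robots.txt has been parsed.
 * Jobs requested by the user are always kept.
 * Returns the number of jobs left in the queue. */
size_t prune_queue(struct job *jobs, size_t njobs,
	const char *const *disallowed, size_t ndisallowed)
{
	size_t kept = 0;

	for (size_t i = 0; i < njobs; i++) {
		if (jobs[i].requested_by_user
			|| !path_is_disallowed(jobs[i].path, disallowed, ndisallowed))
			jobs[kept++] = jobs[i];
	}

	return kept;
}
```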
* bootstrap.conf: Add the human module and its dependency unistd
* include/libwget.h.in: Replace method signature for wget_human_readable
* libwget/bar.c (_bar_slot_t): Add a new struct member for storing the
human readable string that is printed on the progress bar
(wget_bar_init): Initialise the new human_size struct member
(wget_bar_update): Use the new wget_human_readable() method
(_bar_print_final): Same
(wget_bar_deinit): Free the allocated memory for each slot
* libwget/utils.c (wget_human_readable): Replace the old function with a
call to the human_readable() method from gnulib
When the progress bar is updated for the last time (whether exiting due
to error, or download complete), we may want to print information not
normally presented in the progress bar, like total time taken. To allow
this, separate out the printing of the last update of the progress bar
* include/libwget.h.in (_wget_bar_ctx): Remove unused members, final and
cond
* libwget/bar.c: Separate out the bar slot selection for reuse.
(_bar_print_final): New method for printing the last update of a
progress bar
* include/libwget.h.in: Export new function wget_thread_cancel to kill
an existing thread nicely.
* libwget/thread.c: Same
* src/bar.c: Use the cleaner wget_thread_cancel() method to kill the
progress bar instead of polling for a variable.
Give the progress bar its own thread and let it update the entire
display asynchronously at a specified time interval. The existing
implementation refreshes the progress bar for each network packet
downloaded. Over multiple downloader threads and a high speed network
connection this can lead to far too many redrawings of the screen. Also,
each of the downloader threads will block while trying to acquire the
thread mutex because another thread just retrieved a packet. While I
haven't profiled it, it seems like there would be extremely high lock
contention in the existing implementation. Instead, with a separate
thread, we can update all the progress bar slots simultaneously at
regular intervals.
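A rough sketch of this design, assuming POSIX threads and C11 atomics. The
names (bytes_downloaded, slot_add_bytes, bar_start, bar_stop), the fixed
slot count and the 100 ms interval are hypothetical, and the thread is
stopped with a simple flag here. Downloader threads only bump a counter,
and a single thread redraws all slots:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

#define NSLOTS 4

/* One counter per progress bar slot, written by downloader threads */
static atomic_llong bytes_downloaded[NSLOTS];
static atomic_bool bar_running;

/* Called by a downloader for every received chunk: no lock, no redraw */
void slot_add_bytes(int slot, long long n)
{
	atomic_fetch_add(&bytes_downloaded[slot], n);
}

/* Progress bar thread: redraw all slots at a fixed interval */
static void *bar_thread(void *arg)
{
	const struct timespec interval = { 0, 100 * 1000 * 1000 }; /* 100 ms */

	(void) arg;

	while (atomic_load(&bar_running)) {
		for (int i = 0; i < NSLOTS; i++)
			fprintf(stderr, "slot %d: %lld bytes\n",
				i, atomic_load(&bytes_downloaded[i]));

		nanosleep(&interval, NULL);
	}

	return NULL;
}

/* Start the bar thread. Returns 0 on success. */
int bar_start(pthread_t *tid)
{
	atomic_store(&bar_running, true);
	return pthread_create(tid, NULL, bar_thread, NULL);
}

/* Ask the bar thread to stop and wait for it. Returns 0 on success. */
int bar_stop(pthread_t tid)
{
	atomic_store(&bar_running, false);
	return pthread_join(tid, NULL);
}
```

With this split, the number of screen redraws depends only on the interval,
not on the number of packets or downloader threads.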
* bootstrap.conf: Include Gnulib module "ioctl"
* include/libwget.h.in: Define always_inline, flatten and deprecated
compiler attributes for both GCC and Clang
* include/libwget.h.in: Export new functions wget_human_readable,
wget_bar_register and wget_bar_deregister.
* include/libwget.h.in (wget_http_response_t): Add new element
"cur_downloaded" to struct. This element keeps track of the raw
number of bytes downloaded in the response
(_wget_bar_ctx): Define new struct for storing the progress bar
context.
* libwget/bar.c: Fix display of downloaded ratio. Ensure it does not
exceed 100% by comparing the raw downloaded bytes instead of
uncompressed downloaded bytes.
Add support for printing the filename and downloaded bytes to the
progress bar.
* libwget/http.c (wget_http_response_cb): Store the raw number of
bytes downloaded from the network in the response data
* libwget/utils.c (wget_human_readable): New function to convert an
int value to a human readable string
* src/bar.c: Use the entire screen width instead of just 70 columns
for the progress bar.
Provide methods to register and deregister a progress bar
context
(wget_bar_update_thread): New method for the progress bar thread
to update the output periodically.
* src/bar.h: Same
* src/job.h: Move definition of struct DOWNLOADER from main.c to
here
* src/progress.h: Move definition of _body_callback_context from
main.c to here
* src/utils.c: Add new method to determine the width of the screen
* src/utils.h: Same
* src/wget.c: Use the new progress bar contexts. Update, register
and deregister them
(main): Deinit the progress bar only if it was enabled
* src/Makefile.am: Add new files, progress.h, utils.c, utils.h
* include/libwget.h.in: Add wget_tls_session_* set of functions,
add new define WGET_SSL_SESSION_CACHE
* libwget/Makefile.am: Add tls_session.c to libwget
* libwget/ssl_gnutls.c (struct _config): Add tls_session_cache.
(struct _session_context): Add delayed_session_data.
(wget_ssl_set_config_string): Handle WGET_SSL_SESSION_CACHE.
(wget_ssl_open): Add False Start debug messages,
Set session data if found in cache,
Write session data into cache.
(wget_ssl_read_timeout): Write session data into cache when
False Start has been activated.
* libwget/tls_session.c: New file with session data cache implementation
* src/options.c (print_help): Add new options --tls-resume and
--tls-session-file.
(config): Set tls_resume activated by default.
(options[]): Add tls-resume and tls-session-file.
(init): Create session file name, init session data cache,
Set session cache for TLS layer.
(deinit): Free session cache and session cache filename.
* src/options.h (struct config): Add tls_session_db, tls_session_file,
tls_resume.
* src/wget.c (main): Save session cache if changed.
TLS Session Resumption speeds up TLS handshake by 1xRTT.
Together with TCP Fast Open (enabled by default), we are down at
1xRTT TLS handshake overhead. To get down to 0 RTT we need a different
TLS protocol - TLS v1.3 will hopefully soon be standardized.
* include/libwget.h.in: New declaration wget_tcp_get_tcp_fastopen()
* libwget/net.c: New function wget_tcp_get_tcp_fastopen(),
(wget_tcp_connect): Support TFO
* libwget/ssl_gnutls.c: New function _ssl_writev()
(wget_ssl_open): Set _ssl_writev() as push function if TFO is requested
* .gitignore: New entry examples/http_multi_get
* Makefile.am: Move subdir 'examples' before 'src'
* examples/Makefile.am: Add http_multi_get
* examples/http_get2.c (main): Adapt code to API changes
* examples/http_multi_get.c: New example to show async/parallel
requests
* examples/websequencediagram.c (main): Adapt code to API changes
* include/libwget.h.in: New functions wget_http_request_set_header_cb,
wget_http_request_set_body_cb, wget_http_request_set_int.
Changed params of wget_http_get_response.
Removed wget_http_get_response_func, wget_http_get_response_stream,
wget_http_get_response_fd.
Changed params of wget_http_get_response_cb.
Add members pending_requests, received_http2_responses,
pending_http2_requests to wget_http_connection_t.
Add member req to wget_http_response_t.
Add members header_callback, body_callback, header_user_data,
body_user_data, response_keepheader to wget_http_request_t.
New function types wget_http_header_callback_t and
wget_http_body_callback_t.
* libwget/http.c: Implementation of the changes in include/libwget.h.in.
* libwget/http_highlevel.c: Adapt code to API changes
* libwget/ssl_gnutls.c: Fixed defaults for SNI, OCSP and
ALPN.
Adapt code to API changes.
* src/options.c: Reflect changes in libwget/ssl_gnutls.c
* src/wget.c: Adapt code to API changes
* include/libwget.h.in: Adjust wget_random() return type to int
* libwget/random.c (libwget/random.c): Use initstate_r() for seeding,
(wget_srandom): Use initstate_r() for seeding
* examples/print_html_urls.c (html_parse_localfile):
Detect BOM (Byte Order Mark), convert UTF-16 to UTF-8 before parsing.
* include/libwget.h.in: Remove wget_charset_transcode(),
add wget_memiconv() and wget_striconv().
* libwget/encoding.c: Implement wget_memiconv() and wget_striconv()
* src/wget.c (html_parse): Add 'html_len' param,
convert UTF-16 to UTF-8 before parsing.
* tests/test.c: New test for wget_memiconv().
* include/libwget.h.in: Update prototype of wget_robots_parse()
* libwget/robots.c (wget_robots_parse): Add client name as parameter
* src/wget.c: Call wget_robots_parse() with PACKAGE_NAME as client name
* include/libwget.h.in (wget_bar_init, wget_bar_update,
wget_bar_print): Change types to int and off_t.
* libwget/bar.c (wget_bar_init, wget_bar_update,
wget_bar_print): Change types to int and off_t.
* src/wget.c (_prepare_file): Add prototype. Change type of rc to
size_t.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>