44 Commits

Author SHA1 Message Date
25b28e2c1a * libwget/xml.c (xml_context): Add docs to member fields 2024-03-29 12:01:32 +01:00
188989e067 Update copyright year 2024-01-07 12:41:02 +01:00
15ba1db248 Update copyrights 2023-08-31 12:57:30 +02:00
3fb56d60ff Fix parsing HTML/XML entities in URLs from HTML/XML
* libwget/xml.c (wget_xml_decode_entities_inline): Fix decoding of &#dddd; and &#xhhhh;.
* tests/test-base.c (main): Add test case for &#dddd; and &#xhhhh;.

Reported-by: Michael Roosz
2023-08-27 20:38:17 +02:00
f1074e04a3 * libwget/xml.c: Replace VLA 2023-07-31 18:19:18 +02:00
261d05e082 Decode numeric XML entities
* libwget/xml.c (wget_xml_decode_entities_inline): Decode numeric entities.
* tests/test-base.c (main): Add test.
2022-12-02 13:00:06 +01:00
b89af6d0b8 Introduce XML function to decode entities
* include/wget/wget.h: Add wget_xml_decode_entities_inline().
* libwget/iri.c (iri_unescape_inline): Remove decoding of XML entities.
* libwget/xml.c: Add wget_xml_decode_entities_inline().
* src/wget.c (normalize_uri): Call wget_xml_decode_entities_inline().
2022-09-02 19:51:52 +02:00
0ed4616ad0 Refactor and fix code formatting
* libwget/vector.c: Refactor
* libwget/xml.c: Refactor
* tests/test-include-and-exclude-directories.c: Refactor
* tests/libtest.c: Fix code formatting
2022-06-25 14:09:39 +02:00
3dc7f71098 Update copyright year 2022-02-25 17:46:43 +01:00
20a6046fca Fix issue reported by scan-build
* include/wget/wget.h: /// -> //.
* libwget/netrc.c (unescape_password): Check wget_malloc() return against NULL.
* libwget/xml.c (wget_xml_parse_file): Likewise.
2022-02-19 14:35:40 +01:00
1d260ecfef Add support for unquoted HTML attribute values
* libwget/xml.c (getHTMLValue): New function.
  (getValue): Call getHTMLValue if parsing HTML.
* tests/test-base.c: Use an unquoted HREF value to trigger the new code.
2022-01-01 17:55:36 +01:00
0444fd0e19 Fix parsing comments in <style> content
* libwget/xml.c: Add new function getStyleContent(),
  (parseXML): Use getStyleContent() instead of getContent(),
  update link in comment.
2021-07-05 20:07:15 +02:00
22162f82b2 Update copyright year 2021-01-22 21:58:38 +01:00
24b174fde7 * libwget/xml.c (parseXML): Remove use of VLA 2020-02-25 11:57:39 +01:00
59d9ecd9c0 * Update copyright year to 2020 2020-01-10 00:33:02 +01:00
d3bb446420 * libwget/xml.c: Improve C99 compliancy 2019-09-24 10:51:10 +02:00
4b9edbad2d Update Copyright statements to be compatible with update-copyright module 2019-09-10 23:41:56 +02:00
132f9c76e3 * libwget/xml.c (wget_xml_parse_file): Use stack memory for wget_buffer 2019-08-21 21:46:20 +02:00
2f1ecd3728 Rename wget_xml_callback_t -> wget_xml_callback
* include/wget/wget.h: Rename wget_xml_callback_t -> wget_xml_callback
* libwget/xml.c: Likewise
2019-08-12 11:35:31 +02:00
db50c6a801 Rename wget_buffer_t -> wget_buffer
* include/wget/wget.h: Rename wget_buffer_t -> wget_buffer
* examples/*.c: Likewise
* fuzz/libwget_iri_fuzzer.c: Likewise
* libwget/*.c: Likewise
* src/*.c: Likewise
* tests/*.c: Likewise
* unit-tests/*.c: Likewise
2019-07-18 12:30:57 +02:00
252e1ec283 Typedef XML parse functions not as pointer
* include/wget/wget.h: Remove '*' signature from function typedef
* libwget/xml.c: Amend casts and variable types
2019-07-17 15:23:10 +02:00
5bc488bbac Fix several forgotten xmalloc -> wget_malloc
* libwget/css.c (wget_css_parse_file): Rename xmalloc->wget_malloc
* libwget/hashfile.c (wget_hash_init): Likewise
* libwget/ssl_gnutls.c (_cert_verify_hpkp): Likewise
* libwget/xml.c (wget_xml_parse_file): Likewise
2019-06-21 11:42:01 +02:00
cdb3600791 Update copyright to 2019 2019-01-02 12:42:53 +01:00
833895c842 * libwget/xml.c: Fix copyright date (test lgtm.com automatism) [ci skip] 2018-04-11 17:11:32 +02:00
62ea505f64 Use libwget's print-like functions instead of the stdio ones
* include/wget/wget.h: Amend return type of wget_bar_print(), wget_bar_vprintf()
* libwget/Makefile.am: Replace fprintf in generated css_tokenizer.c
* libwget/bar.c: Replace *printf by wget_*printf,
  (_bar_set_progress): Generate progress bar with memcpy/memset
* libwget/hpkp.c: Replace *printf by wget_*printf
* libwget/cookie.c: Likewise
* libwget/hpkp.c: Likewise
* libwget/hsts.c: Likewise
* libwget/http.c: Likewise
* libwget/http_parse.c: Likewise
* libwget/io.c: Likewise
* libwget/net.c: Likewise
* libwget/ocsp.c: Likewise
* libwget/pipe.c: Likewise
* libwget/ssl_gnutls.c: Likewise
* libwget/utils.c: Likewise
* libwget/xml.c: Likewise

The intention is to avoid pulling in (large) code parts from libc when
linking statically. Performance is also better, and we have no
portability issues on non-POSIX platforms (e.g. Windows).
2018-03-28 19:24:28 +02:00
Jay
847ffed69d error handling in XML parsing
* include/wget/wget.h: removed nonnull attribute and defined error code
* libwget/metalink.c: wget_metalink_parse returns null if XML is malformed
* libwget/xml.c: added conditions to detect malformed XML in parseXML
* po/POTFILES.in: localization support for metalink.c
2018-03-08 16:54:45 +01:00
40cef71282 Reduce scope of variables
* libwget/bar.c: Likewise
* libwget/console.c: Likewise
* libwget/encoding.c: Likewise
* libwget/hashmap.c: Likewise
* libwget/io.c: Likewise
* libwget/md5.c: Likewise
* libwget/netrc.c: Likewise
* libwget/ssl_gnutls.c: Likewise
* libwget/xml.c: Likewise
* src/job.c: Likewise
* src/plugin.c: Likewise
* libwget/hashfile.c: Include local private.h
* src/host.c: Remove unused find_free_job_context struct

Found by: cppcheck
2017-10-05 00:27:31 +05:30
b0e1def6fe Add docs for xml parsing functions.
* libwget/xml.c: Rename XML_CONTEXT to _xml_context,
  document wget_* functions
* docs/Makefile.am: Sort order of man page files,
  add building of libwget-xml.3
2017-08-04 11:26:24 +02:00
801714ac52 * libwget/xml.c (getScriptContent): Fix read-off-by-one found by OSS-Fuzz (#2041) 2017-06-04 21:03:42 +02:00
26d4c77d60 * libwget/xml.c (getValue): Fix memleak
Found locally via fuzzing.
2017-05-30 15:53:27 +02:00
f28443ec7a Fix off-by-one read in XML code
* libwget/xml.c (getToken): Fix off-by-one read

Reported-by: oss-fuzz
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=1915
2017-05-30 13:12:33 +02:00
d9af6d266f Open files in binary mode where needed
* examples/websequencediagram.c: Use binary mode on file open
* libwget/css.c: Likewise
* libwget/hashfile.c: Likewise
* libwget/http_highlevel.c: Likewise
* libwget/io.c: Likewise
* libwget/xml.c: Likewise
* src/job.c: Likewise
* src/options.c: Likewise
* src/wget.c: Likewise
* tests/libtest.c: Likewise
* tests/stringmap_perf.c: Likewise

This fixes discrepancies between stat/st_size and read bytes on
MinGW / Wine / Windows.
2017-05-20 22:03:45 +02:00
8a93319b47 Rename strlcpy to wget_strlcpy
* include/wget/wget.h: Rename and remove gnulib deps.
* libwget/strlcpy.c: Rename function and fall back to strlcpy.
* */*.c: Rename strlcpy to wget_strlcpy.
2017-05-02 11:36:51 +02:00
c6ee3d79ad Fix syntax-check 'sc_prohibit_have_config_h'
* cfg.mk: Remove sc_prohibit_have_config_h from local-checks-to-skip
* libwget/*.c: Include <config.h> unconditionally
* src/*.c: Likewise
* tests/*.c: Likewise
2017-04-30 22:01:34 +02:00
31762cc633 Added support for html css style tag and attribute
* include/wget/wget.h: Add 'size_t len' to wget_css_parse_buffer()
  and wget_css_get_urls()
* libwget/css.c (wget_css_parse_buffer): Use yy_scan_bytes() instead
  of yy_scan_string()
* libwget/css_url.c: Add param to wget_css_get_urls() and
  wget_css_parse_buffer()
* libwget/html_url.c: Add _css_parse_encoding() and _css_parse_uri(),
  (_html_get_url): Deal with 'style' attribute,
  (wget_html_free_urls_inline): Free url if needed
* libwget/test_linking.c: Add param to wget_css_parse_buffer()
* libwget/xml.c (parseXML): Add check for 'style'
* src/wget.c (css_parse): Add param 'size_t len'
* tests/Makefile.am: Add new test test-parse-html-css
* tests/test-parse-html-css.c: New file
2017-04-13 10:06:25 +02:00
ec396c577f Fix URLs to HTTPS where possible 2017-02-28 15:31:30 +01:00
bfcd65c12b Use typedefs for function pointer arguments
* include/wget/wget.h: Add typedefs
* libwget/cookie.c: Use typedefs
* libwget/css.c: Likewise
* libwget/css_url.c: Likewise
* libwget/decompressor.c: Likewise
* libwget/hashmap.c: Likewise
* libwget/hsts.c: Likewise
* libwget/http.c: Likewise
* libwget/init.c: Likewise
* libwget/io.c: Likewise
* libwget/list.c: Likewise
* libwget/logger.c: Likewise
* libwget/metalink.c: Likewise
* libwget/net.c: Likewise
* libwget/netrc.c: Likewise
* libwget/ocsp.c: Likewise
* libwget/private.h: Likewise
* libwget/robots.c: Likewise
* libwget/stringmap.c: Likewise
* libwget/tls_session.c: Likewise
* libwget/vector.c: Likewise
* libwget/xml.c: Likewise
* src/blacklist.c: Likewise
* src/host.c: Likewise
* src/options.c: Likewise
* src/wget.c: Likewise
* tests/stringmap_perf.c: Likewise
* tests/test.c: Likewise
2017-01-23 14:43:17 +01:00
9568b0c87e Rename include/libwget.h to include/wget/wget.h+wgetver.h 2016-09-30 09:47:32 +02:00
dd3c2f63b2 Updated copyright year for all relevant files 2016-01-25 13:06:21 +01:00
1e8c3848d0 Call wget_str(n)casecmp_ascii instead of str(n)casecmp
* examples/print_css_urls2.c, libwget/cookie.c, libwget/css.c,
  libwget/html_url.c, libwget/http.c, libwget/iri.c, libwget/metalink.c,
  libwget/net.c, libwget/robots.c, libwget/ssl_gnutls.c,
  libwget/stringmap.c, libwget/xml.c, src/job.c, src/options.c, src/wget.c:
  Call wget_str(n)casecmp_ascii instead of str(n)casecmp
2016-01-11 16:26:25 +01:00
7533d7286b Add define ascii_isspace and ascii_isalpha
* libwget/xml.c: Add and use defines here
2015-11-10 16:24:59 +01:00
ede0d0cc9b Replace calls to strncpy by memcpy
* libwget/xml.c (parseXML): Replace calls to strncpy()
2015-11-10 11:23:14 +01:00
0f8e49128a Transfer copyright to Free Software Foundation, Inc. 2015-09-22 11:50:06 +02:00
c6b0e461a1 Transform Mget into Wget 2015-09-19 22:54:38 +02:00