Commit Graph

10 Commits

SHA1 Message Date
483c304aef Support parsing of UTF-16 HTML documents
* examples/print_html_urls.c (html_parse_localfile):
  Detect BOM (Byte Order Mark), convert UTF-16 to UTF-8 before parsing.
* include/libwget.h.in: Remove wget_charset_transcode(),
  add wget_memiconv() and wget_striconv().
* libwget/encoding.c: Implement wget_memiconv() and wget_striconv()
* src/wget.c (html_parse): Add 'html_len' param,
  convert UTF-16 to UTF-8 before parsing.
* tests/test.c: New test for wget_memiconv().
2016-04-24 15:08:20 +02:00
dd3c2f63b2 Updated copyright year for all relevant files 2016-01-25 13:06:21 +01:00
0f8e49128a Transfer copyright to Free Software Foundation, Inc. 2015-09-22 11:50:06 +02:00
c6b0e461a1 Transform Mget into Wget 2015-09-19 22:54:38 +02:00
6cdd208444 fix several issues found by Coverity 2015-03-07 22:55:17 +01:00
a85b163ee9 added --follow-tags and --ignore-tags 2014-09-09 13:18:32 +02:00
7f131b0ef5 added -k/--convert-links and -K/--backup-converted 2014-07-17 17:12:44 +02:00
2b6645bef1 fixed HTML parsing of uppercase attributes (fixes issue #21) 2014-06-24 10:59:29 +02:00
9c8f45d38d fixed some clang 3.4 warnings 2014-01-11 14:21:05 +01:00
a5f21a0ac9 added example print_html_url.c (scanning and printing URLs from HTML files) 2014-01-03 17:28:50 +01:00
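
Commit 483c304aef above describes detecting a BOM (Byte Order Mark) and converting UTF-16 to UTF-8 before HTML parsing. The following is a minimal sketch of that idea in C, not the libwget code. The names detect_bom() and utf16_to_utf8() are hypothetical, and the conversion is written by hand so the example stays self-contained. According to the commit message, the real change adds wget_memiconv() and wget_striconv() for the conversion step; their signatures are not shown in this log and are not assumed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of libwget): classify a leading BOM.
 * UTF-32 BOMs are deliberately not handled in this sketch. */
enum bom_type { BOM_NONE, BOM_UTF8, BOM_UTF16LE, BOM_UTF16BE };

static enum bom_type detect_bom(const unsigned char *buf, size_t len)
{
	if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
		return BOM_UTF8;
	if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
		return BOM_UTF16LE;
	if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
		return BOM_UTF16BE;
	return BOM_NONE;
}

/* Read one 16-bit code unit in the given byte order. */
static uint32_t read16(const unsigned char *p, int big_endian)
{
	return big_endian ? ((uint32_t) p[0] << 8 | p[1])
	                  : ((uint32_t) p[1] << 8 | p[0]);
}

/* Hypothetical helper: convert UTF-16 input (BOM already skipped) to UTF-8.
 * Returns the number of bytes written, or (size_t) -1 if the input is
 * malformed (odd length, unpaired surrogate) or the output buffer is too small. */
static size_t utf16_to_utf8(const unsigned char *in, size_t inlen, int big_endian,
                            unsigned char *out, size_t outsize)
{
	size_t o = 0;

	if (inlen % 2)
		return (size_t) -1;

	for (size_t i = 0; i < inlen; i += 2) {
		uint32_t cp = read16(in + i, big_endian);

		if (cp >= 0xD800 && cp <= 0xDBFF) {
			/* High surrogate: a low surrogate must follow. */
			if (i + 3 >= inlen)
				return (size_t) -1;
			uint32_t lo = read16(in + i + 2, big_endian);
			if (lo < 0xDC00 || lo > 0xDFFF)
				return (size_t) -1;
			cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
			i += 2;
		} else if (cp >= 0xDC00 && cp <= 0xDFFF) {
			return (size_t) -1; /* low surrogate without a high one */
		}

		size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
		if (o + need > outsize)
			return (size_t) -1;

		if (need == 1) {
			out[o++] = (unsigned char) cp;
		} else if (need == 2) {
			out[o++] = (unsigned char) (0xC0 | (cp >> 6));
			out[o++] = (unsigned char) (0x80 | (cp & 0x3F));
		} else if (need == 3) {
			out[o++] = (unsigned char) (0xE0 | (cp >> 12));
			out[o++] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
			out[o++] = (unsigned char) (0x80 | (cp & 0x3F));
		} else {
			out[o++] = (unsigned char) (0xF0 | (cp >> 18));
			out[o++] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
			out[o++] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
			out[o++] = (unsigned char) (0x80 | (cp & 0x3F));
		}
	}

	return o;
}
```

A caller would run detect_bom() on the raw file contents, skip the two BOM bytes for either UTF-16 result, and pass the remainder to utf16_to_utf8() before handing the buffer to the HTML parser. Because UTF-16 text can contain embedded zero bytes, the length has to be carried explicitly, which is consistent with the commit adding an 'html_len' parameter to html_parse.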