trurl-0.16.1/0000775000000000000000000000000015010312005007662 5ustar00
trurl-0.16.1/.checksrc0000644000000000000000000000002215010312005011440 0ustar00
disable FOPENMODE
trurl-0.16.1/CONTRIBUTING.md0000664000000000000000000001167015010312005012120 0ustar00
# Contributing to trurl

This document is intended to provide a framework for contributing to trurl. It covers requesting new features, fixing existing bugs and effectively using the internal tooling to help PRs merge quickly.

## Opening an issue

trurl uses GitHub's issue tracking to track upcoming work. If you have a feature you want to add or find a bug, simply open an issue in the [issues tab](https://github.com/curl/trurl/issues). Briefly describe the feature you are requesting and why you think it may be valuable for trurl. If you are reporting a bug, be prepared for questions as we will want to reproduce it locally. In general, providing the output of `trurl --version` along with the operating system or distribution you are running is a good starting point.

## Writing a good PR

trurl is a relatively straightforward code base, so it is best to keep your PRs straightforward as well. Avoid trying to fix many bugs in one PR, and instead use many smaller PRs as this avoids potential conflicts when merging. trurl is written in C and uses the [curl code style](https://curl.se/dev/code-style.html). PRs that do not follow the code style will not be merged. trurl is in its early stages, so it's important to open a PR against a recent version of the source code, as a lot can change over a few days. Preferably you would open a PR against the most recent commit in master.

If you are implementing a new feature, it must be submitted with tests and documentation. The process for writing tests is explained below in the tooling section. Documentation exists in two locations: the man page ([trurl.1](https://github.com/curl/trurl/blob/master/trurl.1)) and the help prompt when running `trurl -h`.
Most documentation changes will go in the man page, but if you add a new command line argument then it must also be documented in the help prompt.

It is also important to be prepared for feedback on your PR and adjust it promptly.

## Tooling

The trurl repository has a few small helper tools to make development easier.

**checksrc.pl** is used to ensure the code style is correct. It accepts C files as command line arguments, and prints nothing if the code style is valid. If the code style is incorrect, checksrc.pl reports the line the error is on and a brief description of what is wrong. You may run `make checksrc` to check the C sources (`trurl.c` and `version.h`) for style compliance.

**test.py** is used to run automated tests for trurl. It loads tests from `tests.json` (described below) and reports the number of tests passed. You may specify the tests to run by passing a list of comma-separated numbers as command line arguments, such as `4,8,15,16,23,42`. Note that there is no space between the numbers. `test.py` can also use valgrind to test for memory errors when passed `--with-valgrind` as a command line argument; note that running all the tests this way may take a while. `test.py` also skips tests that require a specific curl runtime or buildtime.

### Adding tests

Tests are located in [tests.json](https://github.com/curl/trurl/blob/master/tests.json). This file is an array of JSON objects that outline an input and what the expected output should be.
Below is a simple example of a single test:

```json
{
    "input": {
        "arguments": [
            "https://example.com"
        ]
    },
    "expected": {
        "stdout": "https://example.com/\n",
        "stderr": "",
        "returncode": 0
    }
}
```

`"arguments"` is an array of the arguments to run in the test, so if you wanted to pass multiple arguments it would look something like:

```json
{
    "input": {
        "arguments": [
            "https://curl.se:22/",
            "-s",
            "port=443",
            "--get",
            "{url}"
        ]
    },
    "expected": {
        "stdout": "https://curl.se/\n",
        "stderr": "",
        "returncode": 0
    }
}
```

trurl may also return JSON. If you are adding a test that returns JSON to stdout, write the JSON directly instead of a string as in the examples above. Below is an example of what stdout should be if it is a JSON test, where `"input"` is what trurl accepts from the command line and `"expected"` is what trurl should return.

```json
"expected": {
    "stdout": [
        {
            "url": "https://curl.se/",
            "scheme": "https",
            "host": "curl.se",
            "port": "443",
            "raw_port": "",
            "path": "/",
            "query": "",
            "params": []
        }
    ],
    "returncode": 0,
    "stderr": ""
}
```

# Tips to make opening a PR easier

- Run `make checksrc` and `make test-memory` locally before opening a PR. These run automatically when a PR is opened, so you might as well make sure they pass beforehand.
- Update the man page and the help prompt accordingly. Documentation is annoying but if everyone writes a little it's not bad.
- Add tests to cover new features or the bug you fixed.
trurl-0.16.1/COPYING0000664000000000000000000000210015010312005010706 0ustar00
COPYRIGHT AND PERMISSION NOTICE

Copyright (c) 2023 - 2024, Daniel Stenberg, , and many contributors, see the THANKS file.

All rights reserved.

Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. Except as contained in this notice, the name of a copyright holder shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Software without prior written authorization of the copyright holder. trurl-0.16.1/Makefile0000664000000000000000000000471615010312005011332 0ustar00########################################################################## # _ _ ____ _ # Project ___| | | | _ \| | # / __| | | | |_) | | # | (__| |_| | _ <| |___ # \___|\___/|_| \_\_____| # # Copyright (C) Daniel Stenberg, , et al. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at https://curl.se/docs/copyright.html. # # You may opt to use, copy, modify, merge, publish, distribute and/or sell # copies of the Software, and permit persons to whom the Software is # furnished to do so, under the terms of the COPYING file. # # This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY # KIND, either express or implied. 
# # SPDX-License-Identifier: curl # ########################################################################## TARGET = trurl OBJS = trurl.o ifndef TRURL_IGNORE_CURL_CONFIG LDLIBS += $$(curl-config --libs) CFLAGS += $$(curl-config --cflags) endif CFLAGS += -W -Wall -Wshadow -pedantic CFLAGS += -Wconversion -Wmissing-prototypes -Wwrite-strings -Wsign-compare -Wno-sign-conversion ifndef NDEBUG CFLAGS += -Werror -g endif MANUAL = trurl.1 PREFIX ?= /usr/local BINDIR ?= $(PREFIX)/bin MANDIR ?= $(PREFIX)/share/man/man1 ZSH_COMPLETIONSDIR ?= $(PREFIX)/share/zsh/site-functions COMPLETION_FILES=scripts/_trurl.zsh INSTALL ?= install PYTHON3 ?= python3 all: $(TARGET) $(MANUAL) $(TARGET): $(OBJS) $(CC) $(LDFLAGS) $(OBJS) -o $(TARGET) $(LDLIBS) trurl.o: trurl.c version.h $(MANUAL): trurl.md ./scripts/cd2nroff trurl.md > $(MANUAL) .PHONY: install install: $(INSTALL) -d $(DESTDIR)$(BINDIR) $(INSTALL) -m 0755 $(TARGET) $(DESTDIR)$(BINDIR) (if test -f $(MANUAL); then \ $(INSTALL) -d $(DESTDIR)$(MANDIR); \ $(INSTALL) -m 0644 $(MANUAL) $(DESTDIR)$(MANDIR); \ fi) (if test -f $(COMPLETION_FILES); then \ $(INSTALL) -d $(DESTDIR)$(ZSH_COMPLETIONSDIR); \ $(INSTALL) -m 0755 $(COMPLETION_FILES) $(ZSH_COMPLETIONSDIR)/_trurl; \ fi) .PHONY: clean clean: rm -f $(OBJS) $(TARGET) $(COMPLETION_FILES) $(MANUAL) .PHONY: test test: $(TARGET) @$(PYTHON3) test.py .PHONY: test-memory test-memory: $(TARGET) @$(PYTHON3) test.py --with-valgrind .PHONY: checksrc checksrc: ./scripts/checksrc.pl trurl.c version.h .PHONY: completions completions: trurl.md ./scripts/generate_completions.sh $^ trurl-0.16.1/README.md0000664000000000000000000001070715010312005011146 0ustar00 # [![trurl logo](https://curl.se/logo/trurl-logo.svg)](https://curl.se/trurl) # trurl Command line tool for URL parsing and manipulation [Video presentation](https://youtu.be/oDL7DVszr2w) ## Examples **Replace the hostname of a URL:** ```text $ trurl --url https://curl.se --set host=example.com https://example.com/ ``` **Create a URL by 
setting components:** ```text $ trurl --set host=example.com --set scheme=ftp ftp://example.com/ ``` **Redirect a URL:** ```text $ trurl --url https://curl.se/we/are.html --redirect here.html https://curl.se/we/here.html ``` **Change port number:** ```text $ trurl --url https://curl.se/we/../are.html --set port=8080 https://curl.se:8080/are.html ``` **Extract the path from a URL:** ```text $ trurl --url https://curl.se/we/are.html --get '{path}' /we/are.html ``` **Extract the port from a URL:** ```text $ trurl --url https://curl.se/we/are.html --get '{port}' 443 ``` **Append a path segment to a URL:** ```text $ trurl --url https://curl.se/hello --append path=you https://curl.se/hello/you ``` **Append a query segment to a URL:** ```text $ trurl --url "https://curl.se?name=hello" --append query=search=string https://curl.se/?name=hello&search=string ``` **Read URLs from stdin:** ```text $ cat urllist.txt | trurl --url-file - ... ``` **Output JSON:** ```text $ trurl "https://fake.host/hello#frag" --set user=::moo:: --json [ { "url": "https://%3a%3amoo%3a%3a@fake.host/hello#frag", "parts": { "scheme": "https", "user": "::moo::", "host": "fake.host", "path": "/hello", "fragment": "frag" } } ] ``` **Remove tracking tuples from query:** ```text $ trurl "https://curl.se?search=hey&utm_source=tracker" --qtrim "utm_*" https://curl.se/?search=hey ``` **Show a specific query key value:** ```text $ trurl "https://example.com?a=home&here=now&thisthen" -g '{query:a}' home ``` **Sort the key/value pairs in the query component:** ```text $ trurl "https://example.com?b=a&c=b&a=c" --sort-query https://example.com?a=c&b=a&c=b ``` **Work with a query that uses a semicolon separator:** ```text $ trurl "https://curl.se?search=fool;page=5" --qtrim "search" --query-separator ";" https://curl.se?page=5 ``` **Accept spaces in the URL path:** ```text $ trurl "https://curl.se/this has space/index.html" --accept-space https://curl.se/this%20has%20space/index.html ``` ## Install ### Linux It is 
quite easy to compile the C source with GCC:

```text
$ make
cc -W -Wall -pedantic -g -c -o trurl.o trurl.c
cc trurl.o -lcurl -o trurl
```

trurl is also available in [some package managers](https://github.com/curl/trurl/wiki/Get-trurl-for-your-OS). If it is not listed you can try searching for it using the package manager of your preferred distribution.

### Windows

1. Download and run the [Cygwin installer](https://www.cygwin.com/install.html).
2. Follow the instructions provided by the installer. When prompted to select packages, make sure to choose the following: curl, libcurl-devel, libcurl4, make and gcc-core.
3. (optional) Add the Cygwin bin directory to your system PATH variable.
4. Use `make`, just like on Linux.

## Prerequisites

Development files of libcurl (e.g. `libcurl4-openssl-dev` or `libcurl4-gnutls-dev`) are needed for compilation. Requires libcurl version 7.62.0 or newer (the first libcurl to ship the URL parsing API).

trurl also uses `CURLUPART_ZONEID` added in libcurl 7.81.0 and `curl_url_strerror()` added in libcurl 7.80.0.

It would certainly be possible to make trurl work with older libcurl versions if someone wanted to.

### Older libcurls

trurl builds with libcurl older than 7.81.0 but will then not work as well. For all the documented goodness, use a more modern libcurl.

### trurl / libcurl Compatibility

| trurl Feature   | Minimum libcurl Version |
|-----------------|-------------------------|
| imap-options    | 7.30.0                  |
| normalize-ipv   | 7.77.0                  |
| white-space     | 7.78.0                  |
| url-strerror    | 7.80.0                  |
| zone-id         | 7.81.0                  |
| punycode        | 7.88.0                  |
| punycode2idn    | 8.3.0                   |
| no-guess-scheme | 8.9.0                   |

For more details on how trurl will behave if these features are missing see [URL Quirks](https://github.com/curl/trurl/blob/master/URL-QUIRKS.md).
To see the features your version of trurl supports as well as the version of libcurl it is built with, run the following command: `trurl --version` trurl-0.16.1/RELEASE-NOTES0000664000000000000000000000070615010312005011556 0ustar00# trurl 0.16.1 ## Bugfixes - COPYING: add the "and many contributors" text from the curl license - scripts: import cd2nroff from curl - trurl: handle zero length query pairs - trurl.md: fix typo in --replace-append - Update README.md to link to the getting trurl wiki page - Autogenerate ZSH completions based on trurl.md - Makefile: only create MANDIR when manpage is installed Contributors to this release: Daniel Stenberg, Jacob Mealey, Sertonix trurl-0.16.1/RELEASE-PROCEDURE.md0000664000000000000000000000117015010312005012611 0ustar00trurl release procedure - how to do a release ============================================== in the source code repo ----------------------- - edit `RELEASE-NOTES` to be accurate - run `./scripts/mkrelease [version]` - make sure all relevant changes are committed on the master branch - tag the git repo in this style: `git tag -a trurl-[version]` -a annotates the tag - push the git commits and the new tag - Go to https://github.com/curl/trurl/tags and edit the tag as a release Consider allowing it to make a discussion post about it. celebrate --------- - suitable beverage intake is encouraged for the festivities trurl-0.16.1/THANKS0000664000000000000000000000101515010312005010572 0ustar00This project exists only thanks to the awesome people who make it happen. The following friends have contributed: Dan Fandrich Daniel Gustafsson Daniel Stenberg Ehsan Emanuele Torre Enno Tensing Gustavo Costa Håvard Bønes Jacob Mealey Jay Satiro Jeremy Lecour Krishean Draconis Luca Barbato ma Marian Posaceanu Martin Hauke Michael Ablassmeier Michael Lass Nekobit Nicolas CARPi Olaf Alders Pascal Knecht Paul Roub Paul Wise Renato Botelho Ruoyu Zhong Sajad F. 
Maghrebi
Sevan Janiyan
Viktor Szakats
積丹尼 Dan Jacobson
trurl-0.16.1/URL-QUIRKS.md0000664000000000000000000000350315010312005011663 0ustar00
# URL Quirks

This is a collection of peculiarities you may find in trurl due to bugs or changes/improvements in libcurl's URL handling.

## The URL API

Was introduced in libcurl 7.62.0. No older libcurl versions can be used.

Build-time requirement.

## Extracting zone id

Added in libcurl 7.65.0. The `CURLUE_NO_ZONEID` error code was added in 7.81.0.

Build-time requirement.

## Normalizing IPv4 addresses

Added in libcurl 7.77.0. Before that, the source formatting was kept.

Run-time requirement.

## Allow space

The libcurl URL parser was given the ability to allow spaces in libcurl 7.78.0. trurl therefore cannot offer this feature with older libcurl versions.

Build-time requirement.

## `curl_url_strerror()`

This API call was added in 7.80.0. Using a libcurl version older than this makes trurl output less helpful error messages.

Build-time requirement.

## Normalizing IPv6 addresses

Implemented in libcurl 7.81.0. Before this, the source formatting was kept.

Run-time requirement.

## `CURLU_PUNYCODE`

Added in libcurl 7.88.0.

Build-time requirement.

## Accepting % in host names

The host name parser has been made stricter over time, with the most recent enhancement merged for libcurl 8.0.0.

Run-time requirement.

## Parsing IPv6 literals when libcurl does not support IPv6

Before libcurl 8.0.0 the URL parser was not able to parse IPv6 addresses if libcurl itself was built without IPv6 capabilities.

Run-time requirement.

## URL encoding of fragments

This was a libcurl bug, fixed in libcurl 8.1.0.

Run-time requirement.

## Bad IPv4 numerical address

The normalization of IPv4 addresses would just ignore bad addresses, while newer libcurl versions will reject host names using invalid IPv4 addresses. Fixed in 8.1.0.

Run-time requirement.

## Set illegal scheme

Permitted before libcurl 8.1.0.

Run-time requirement.
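
## Comparing versions against these thresholds

The version thresholds above can be read as a small lookup table. The sketch below is illustrative only and is not part of trurl: the names `QUIRKS`, `parse_version`, `version_num` and `unmet_quirks` are made up for this example, and only a subset of the sections is included. It assumes plain `major.minor.patch` version strings. `version_num` packs a version in the `0xXXYYZZ` layout that libcurl uses for `LIBCURL_VERSION_NUM`, which is the kind of number a build-time check would compare against.

```python
# Illustrative sketch: the thresholds are copied from the sections above,
# every name defined here is hypothetical and not part of trurl.
QUIRKS = {
    "URL API": ("7.62.0", "build-time"),
    "normalizing IPv4 addresses": ("7.77.0", "run-time"),
    "allow space": ("7.78.0", "build-time"),
    "curl_url_strerror()": ("7.80.0", "build-time"),
    "normalizing IPv6 addresses": ("7.81.0", "run-time"),
    "CURLU_PUNYCODE": ("7.88.0", "build-time"),
    "URL encoding of fragments": ("8.1.0", "run-time"),
}


def parse_version(text):
    """Turn a dotted version such as "7.81.0" into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def version_num(text):
    """Pack a version as 0xXXYYZZ, the layout of LIBCURL_VERSION_NUM."""
    major, minor, patch = parse_version(text)
    return (major << 16) | (minor << 8) | patch


def unmet_quirks(libcurl_version):
    """List the entries whose minimum version is newer than libcurl_version."""
    have = parse_version(libcurl_version)
    return sorted(
        name for name, (minimum, _kind) in QUIRKS.items()
        if have < parse_version(minimum)
    )


if __name__ == "__main__":
    print(hex(version_num("7.81.0")))
    for name in unmet_quirks("7.80.0"):
        print(name, QUIRKS[name])
```

For a build-time entry the relevant version is the libcurl that trurl was compiled against, while for a run-time entry it is the libcurl that is loaded when trurl runs. `trurl --version` shows the libcurl version trurl is built with.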
trurl-0.16.1/completions/0000775000000000000000000000000015010312005012216 5ustar00trurl-0.16.1/completions/_trurl.zsh.in0000664000000000000000000000440015010312005014656 0ustar00#compdef trurl ########################################################################## # _ _ # Project | |_ _ __ _ _ _ __| | # | __| '__| | | | '__| | # | |_| | | |_| | | | | # \__|_| \__,_|_| |_| # # Copyright (C) Daniel Stenberg, , et al. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at https://curl.se/docs/copyright.html. # # You may opt to use, copy, modify, merge, publish, distribute and/or sell # copies of the Software, and permit persons to whom the Software is # furnished to do so, under the terms of the COPYING file. # # This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY # KIND, either express or implied. # # SPDX-License-Identifier: curl # ########################################################################## # This file is generated from trurls generate_completions.sh # standalone flags - things that have now follow on standalone_flags=(@TRURL_STANDALONE_FLAGS@) # component options - flags that expected to come after them component_options=(@TRURL_COMPONENT_OPTIONS@) # components that take *something* as a param but we can't # be sure what random_options=(@TRURL_RANDOM_OPTIONS@) # Components are specific URL parts that are only completed # after a component_options appears component_list=( @TRURL_COMPONENT_LIST@) if (( "${component_options[(Ie)${words[$CURRENT-1]}]}" )); then compadd -S "=" "${component_list[@]}" return 0 fi # if we expect another parameter that trurl doesn't define then # we should (i.e. a component) then fall back on ZSH _path_file if (( "${random_options[(Ie)${words[$CURRENT-1]}]}" )); then _path_files return 0 fi # calling compadd directly allows us the let the flags be # repeatable so we can recall --set, --get etc. 
repeatable=( "${component_options[@]}" "${random_options[@]}" ) args=( "${repeatable[@]}" ) # only apply single completions which haven't been used. for sf in "${standalone_flags[@]}"; do if ! (( "${words[(Ie)$sf]}" )); then args+=("$sf") fi done compadd "${args[@]}" trurl-0.16.1/scripts/0000775000000000000000000000000015010312005011351 5ustar00trurl-0.16.1/scripts/cd2nroff0000775000000000000000000003641515010312005013013 0ustar00#!/usr/bin/env perl #*************************************************************************** # _ _ ____ _ # Project ___| | | | _ \| | # / __| | | | |_) | | # | (__| |_| | _ <| |___ # \___|\___/|_| \_\_____| # # Copyright (C) Daniel Stenberg, , et al. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at https://curl.se/docs/copyright.html. # # You may opt to use, copy, modify, merge, publish, distribute and/or sell # copies of the Software, and permit persons to whom the Software is # furnished to do so, under the terms of the COPYING file. # # This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY # KIND, either express or implied. # # SPDX-License-Identifier: curl # ########################################################################### =begin comment Converts a curldown file to nroff (manpage). =end comment =cut use strict; use warnings; my $cd2nroff = "0.1"; # to keep check my $dir; my $extension; my $keepfilename; while(@ARGV) { if($ARGV[0] eq "-d") { shift @ARGV; $dir = shift @ARGV; } elsif($ARGV[0] eq "-e") { shift @ARGV; $extension = shift @ARGV; } elsif($ARGV[0] eq "-k") { shift @ARGV; $keepfilename = 1; } elsif($ARGV[0] eq "-h") { print < Write the output to the file name from the meta-data in the specified directory, instead of writing to stdout -e If -d is used, this option can provide an added "extension", arbitrary text really, to append to the file name. 
-h This help text, -v Show version then exit HELP ; exit 0; } elsif($ARGV[0] eq "-v") { print "cd2nroff version $cd2nroff\n"; exit 0; } else { last; } } use POSIX qw(strftime); my @ts; if (defined($ENV{SOURCE_DATE_EPOCH})) { @ts = gmtime($ENV{SOURCE_DATE_EPOCH}); } else { @ts = localtime; } my $date = strftime "%Y-%m-%d", @ts; sub outseealso { my (@sa) = @_; my $comma = 0; my @o; push @o, ".SH SEE ALSO\n"; for my $s (sort @sa) { push @o, sprintf "%s.BR $s", $comma ? ",\n": ""; $comma = 1; } push @o, "\n"; return @o; } sub outprotocols { my (@p) = @_; my $comma = 0; my @o; push @o, ".SH PROTOCOLS\n"; if($p[0] eq "TLS") { push @o, "This functionality affects all TLS based protocols: HTTPS, FTPS, IMAPS, POP3S, SMTPS etc."; } else { my @s = sort @p; push @o, "This functionality affects "; for my $e (sort @s) { push @o, sprintf "%s%s", $comma ? (($e eq $s[-1]) ? " and " : ", "): "", lc($e); $comma = 1; } if($#s == 0) { if($s[0] eq "All") { push @o, " supported protocols"; } else { push @o, " only"; } } } push @o, "\n"; return @o; } sub outtls { my (@t) = @_; my $comma = 0; my @o; if($t[0] eq "All") { push @o, "\nAll TLS backends support this option."; } elsif($t[0] eq "none") { push @o, "\nNo TLS backend supports this option."; } else { push @o, "\nThis option works only with the following TLS backends:\n"; my @s = sort @t; for my $e (@s) { push @o, sprintf "%s$e", $comma ? (($e eq $s[-1]) ? 
" and " : ", "): ""; $comma = 1; } } push @o, "\n"; return @o; } my %knownprotos = ( 'DICT' => 1, 'FILE' => 1, 'FTP' => 1, 'FTPS' => 1, 'GOPHER' => 1, 'GOPHERS' => 1, 'HTTP' => 1, 'HTTPS' => 1, 'IMAP' => 1, 'IMAPS' => 1, 'LDAP' => 1, 'LDAPS' => 1, 'MQTT' => 1, 'POP3' => 1, 'POP3S' => 1, 'RTMP' => 1, 'RTMPS' => 1, 'RTSP' => 1, 'SCP' => 1, 'SFTP' => 1, 'SMB' => 1, 'SMBS' => 1, 'SMTP' => 1, 'SMTPS' => 1, 'TELNET' => 1, 'TFTP' => 1, 'WS' => 1, 'WSS' => 1, 'TLS' => 1, 'TCP' => 1, 'QUIC' => 1, 'All' => 1 ); my %knowntls = ( 'BearSSL' => 1, 'GnuTLS' => 1, 'mbedTLS' => 1, 'OpenSSL' => 1, 'rustls' => 1, 'Schannel' => 1, 'Secure Transport' => 1, 'wolfSSL' => 1, 'All' => 1, 'none' => 1, ); sub single { my @seealso; my @proto; my @tls; my $d; my ($f)=@_; my $copyright; my $errors = 0; my $fh; my $line; my $list; my $tlslist; my $section; my $source; my $addedin; my $spdx; my $start = 0; my $title; if(defined($f)) { if(!open($fh, "<:crlf", "$f")) { print STDERR "cd2nroff failed to open '$f' for reading: $!\n"; return 1; } } else { $f = "STDIN"; $fh = \*STDIN; binmode($fh, ":crlf"); } while(<$fh>) { $line++; if(!$start) { if(/^---/) { # header starts here $start = 1; } next; } if(/^Title: *(.*)/i) { $title=$1; } elsif(/^Section: *(.*)/i) { $section=$1; } elsif(/^Source: *(.*)/i) { $source=$1; } elsif(/^See-also: +(.*)/i) { $list = 1; # 1 for see-also push @seealso, $1; } elsif(/^See-also: */i) { if($seealso[0]) { print STDERR "$f:$line:1:ERROR: bad See-Also, needs list\n"; return 2; } $list = 1; # 1 for see-also } elsif(/^Protocol:/i) { $list = 2; # 2 for protocol } elsif(/^TLS-backend:/i) { $list = 3; # 3 for TLS backend } elsif(/^Added-in: *(.*)/i) { $addedin=$1; if(($addedin !~ /^[0-9.]+[0-9]\z/) && ($addedin ne "n/a")) { print STDERR "$f:$line:1:ERROR: invalid version number in Added-in line: $addedin\n"; return 2; } } elsif(/^ +- (.*)/i) { # the only lists we support are see-also and protocol if($list == 1) { push @seealso, $1; } elsif($list == 2) { push @proto, $1; } 
elsif($list == 3) { push @tls, $1; } else { print STDERR "$f:$line:1:ERROR: list item without owner?\n"; return 2; } } # REUSE-IgnoreStart elsif(/^C: (.*)/i) { $copyright=$1; } elsif(/^SPDX-License-Identifier: (.*)/i) { $spdx=$1; } # REUSE-IgnoreEnd elsif(/^---/) { # end of the header section if(!$title) { print STDERR "$f:$line:1:ERROR: no 'Title:' in $f\n"; return 1; } if(!$section) { print STDERR "$f:$line:1:ERROR: no 'Section:' in $f\n"; return 2; } if(!$source) { print STDERR "$f:$line:1:ERROR: no 'Source:' in $f\n"; return 2; } if(($source eq "libcurl") && !$addedin) { print STDERR "$f:$line:1:ERROR: no 'Added-in:' in $f\n"; return 2; } if(!$seealso[0]) { print STDERR "$f:$line:1:ERROR: no 'See-also:' present\n"; return 2; } if(!$copyright) { print STDERR "$f:$line:1:ERROR: no 'C:' field present\n"; return 2; } if(!$spdx) { print STDERR "$f:$line:1:ERROR: no 'SPDX-License-Identifier:' field present\n"; return 2; } if($section == 3) { if(!$proto[0]) { printf STDERR "$f:$line:1:ERROR: missing Protocol:\n"; exit 2; } my $tls = 0; for my $p (@proto) { if($p eq "TLS") { $tls = 1; } if(!$knownprotos{$p}) { printf STDERR "$f:$line:1:ERROR: invalid protocol used: $p:\n"; exit 2; } } # This is for TLS, require TLS-backend: if($tls) { if(!$tls[0]) { printf STDERR "$f:$line:1:ERROR: missing TLS-backend:\n"; exit 2; } for my $t (@tls) { if(!$knowntls{$t}) { printf STDERR "$f:$line:1:ERROR: invalid TLS backend: $t:\n"; exit 2; } } } } last; } else { chomp; print STDERR "$f:$line:1:ERROR: unrecognized header keyword: '$_'\n"; $errors++; } } if(!$start) { print STDERR "$f:$line:1:ERROR: no header present\n"; return 2; } my @desc; my $quote = 0; my $blankline = 0; my $header = 0; # cut off the leading path from the file name, if any $f =~ s/^(.*[\\\/])//; push @desc, ".\\\" generated by cd2nroff $cd2nroff from $f\n"; push @desc, ".TH $title $section \"$date\" $source\n"; while(<$fh>) { $line++; $d = $_; if($quote) { if($quote == 4) { # remove the indentation if($d =~ /^ 
(.*)/) { push @desc, "$1\n"; next; } else { # end of quote $quote = 0; push @desc, ".fi\n"; next; } } if(/^~~~/) { # end of quote $quote = 0; push @desc, ".fi\n"; next; } # convert single backslahes to doubles $d =~ s/\\/\\\\/g; # lines starting with a period needs it escaped $d =~ s/^\./\\&./; push @desc, $d; next; } # remove single line HTML comments $d =~ s///g; # **bold** $d =~ s/\*\*(\S.*?)\*\*/\\fB$1\\fP/g; # *italics* $d =~ s/\*(\S.*?)\*/\\fI$1\\fP/g; my $back = $d; # remove all backticked pieces $back =~ s/\`(.*?)\`//g; if($back =~ /[^\\][\<\>]/) { print STDERR "$f:$line:1:ERROR: un-escaped < or > used\n"; $errors++; } # convert backslash-'<' or '> to just the second character $d =~ s/\\([<>])/$1/g; # mentions of curl symbols with manpages use italics by default $d =~ s/((lib|)curl([^ ]*\(3\)))/\\fI$1\\fP/gi; # backticked becomes italics $d =~ s/\`(.*?)\`/\\fI$1\\fP/g; if(/^## (.*)/) { my $word = $1; # if there are enclosing quotes, remove them first $word =~ s/[\"\'\`](.*)[\"\'\`]\z/$1/; # enclose in double quotes if there is a space present if($word =~ / /) { push @desc, ".IP \"$word\"\n"; } else { push @desc, ".IP $word\n"; } $header = 1; } elsif(/^##/) { # end of IP sequence push @desc, ".PP\n"; $header = 1; } elsif(/^# (.*)/) { my $word = $1; # if there are enclosing quotes, remove them first $word =~ s/[\"\'](.*)[\"\']\z/$1/; if($word eq "PROTOCOLS") { print STDERR "$f:$line:1:WARN: PROTOCOLS section in source file\n"; } elsif($word eq "AVAILABILITY") { print STDERR "$f:$line:1:WARN: AVAILABILITY section in source file\n"; } elsif($word eq "%PROTOCOLS%") { # insert the generated PROTOCOLS section push @desc, outprotocols(@proto); if($proto[0] eq "TLS") { push @desc, outtls(@tls); } $header = 1; next; } elsif($word eq "%AVAILABILITY%") { if($addedin ne "n/a") { # insert the generated AVAILABILITY section push @desc, ".SH AVAILABILITY\n"; push @desc, "Added in curl $addedin\n"; } $header = 1; next; } push @desc, ".SH $word\n"; $header = 1; } 
elsif(/^~~~c/) { # start of a code section, not indented $quote = 1; push @desc, "\n" if($blankline && !$header); $header = 0; push @desc, ".nf\n"; } elsif(/^~~~/) { # start of a quote section; not code, not indented $quote = 1; push @desc, "\n" if($blankline && !$header); $header = 0; push @desc, ".nf\n"; } elsif(/^ (.*)/) { # quoted, indented by 4 space $quote = 4; push @desc, "\n" if($blankline && !$header); $header = 0; push @desc, ".nf\n$1\n"; } elsif(/^[ \t]*\n/) { # count and ignore blank lines $blankline++; } else { # don't output newlines if this is the first content after a # header push @desc, "\n" if($blankline && !$header); $blankline = 0; $header = 0; # quote minuses in the output $d =~ s/([^\\])-/$1\\-/g; # replace single quotes $d =~ s/\'/\\(aq/g; # handle double quotes first on the line $d =~ s/^(\s*)\"/$1\\&\"/; # lines starting with a period needs it escaped $d =~ s/^\./\\&./; if($d =~ /^(.*) /) { printf STDERR "$f:$line:%d:ERROR: 2 spaces detected\n", length($1); $errors++; } if($d =~ /^[ \t]*\n/) { # replaced away all contents $blankline= 1; } else { push @desc, $d; } } } if($fh != \*STDIN) { close($fh); } push @desc, outseealso(@seealso); if($dir) { if($keepfilename) { $title = $f; $title =~ s/\.[^.]*$//; } my $outfile = "$dir/$title.$section"; if(defined($extension)) { $outfile .= $extension; } if(!open(O, ">", $outfile)) { print STDERR "Failed to open $outfile : $!\n"; return 1; } print O @desc; close(O); } else { print @desc; } return $errors; } if(@ARGV) { for my $f (@ARGV) { my $r = single($f); if($r) { exit $r; } } } else { exit single(); } trurl-0.16.1/scripts/checksrc.pl0000775000000000000000000007744615010312005013520 0ustar00#!/usr/bin/env perl #*************************************************************************** # _ _ ____ _ # Project ___| | | | _ \| | # / __| | | | |_) | | # | (__| |_| | _ <| |___ # \___|\___/|_| \_\_____| # # Copyright (C) Daniel Stenberg, , et al. 
# # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at https://curl.se/docs/copyright.html. # # You may opt to use, copy, modify, merge, publish, distribute and/or sell # copies of the Software, and permit persons to whom the Software is # furnished to do so, under the terms of the COPYING file. # # This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY # KIND, either express or implied. # # SPDX-License-Identifier: curl # ########################################################################### use strict; use warnings; my $max_column = 79; my $indent = 2; my $warnings = 0; my $swarnings = 0; my $errors = 0; my $serrors = 0; my $suppressed; # skipped problems my $file; my $dir="."; my $wlist=""; my @alist; my $windows_os = $^O eq 'MSWin32' || $^O eq 'cygwin' || $^O eq 'msys'; my $verbose; my %skiplist; my %ignore; my %ignore_set; my %ignore_used; my @ignore_line; my %warnings_extended = ( 'COPYRIGHTYEAR' => 'copyright year incorrect', 'STRERROR', => 'strerror() detected', ); my %warnings = ( 'ASSIGNWITHINCONDITION' => 'assignment within conditional expression', 'ASTERISKNOSPACE' => 'pointer declared without space before asterisk', 'ASTERISKSPACE' => 'pointer declared with space after asterisk', 'BADCOMMAND' => 'bad !checksrc! 
instruction', 'BANNEDFUNC' => 'a banned function was used', 'BRACEELSE' => '} else on the same line', 'BRACEPOS' => 'wrong position for an open brace', 'BRACEWHILE' => 'A single space between open brace and while', 'COMMANOSPACE' => 'comma without following space', 'COMMENTNOSPACEEND' => 'no space before */', 'COMMENTNOSPACESTART' => 'no space following /*', 'COPYRIGHT' => 'file missing a copyright statement', 'CPPCOMMENTS' => '// comment detected', 'DOBRACE' => 'A single space between do and open brace', 'EMPTYLINEBRACE' => 'Empty line before the open brace', 'EQUALSNOSPACE' => 'equals sign without following space', 'EQUALSNULL' => 'if/while comparison with == NULL', 'EXCLAMATIONSPACE' => 'Whitespace after exclamation mark in expression', 'FOPENMODE' => 'fopen needs a macro for the mode string', 'INCLUDEDUP', => 'same file is included again', 'INDENTATION' => 'wrong start column for code', 'LONGLINE' => "Line longer than $max_column", 'MULTISPACE' => 'multiple spaces used when not suitable', 'NOSPACEEQUALS' => 'equals sign without preceding space', 'NOTEQUALSZERO', => 'if/while comparison with != 0', 'ONELINECONDITION' => 'conditional block on the same line as the if()', 'OPENCOMMENT' => 'file ended with a /* comment still "open"', 'PARENBRACE' => '){ without sufficient space', 'RETURNNOSPACE' => 'return without space', 'SEMINOSPACE' => 'semicolon without following space', 'SIZEOFNOPAREN' => 'use of sizeof without parentheses', 'SNPRINTF' => 'use of snprintf', 'SPACEAFTERPAREN' => 'space after open parenthesis', 'SPACEBEFORECLOSE' => 'space before a close parenthesis', 'SPACEBEFORECOMMA' => 'space before a comma', 'SPACEBEFOREPAREN' => 'space before an open parenthesis', 'SPACESEMICOLON' => 'space before semicolon', 'SPACESWITCHCOLON' => 'space before colon of switch label', 'TABS' => 'TAB characters not allowed', 'TRAILINGSPACE' => 'Trailing whitespace on the line', 'TYPEDEFSTRUCT' => 'typedefed struct', 'UNUSEDIGNORE' => 'a warning ignore was not used', ); sub 
readskiplist { open(my $W, '<', "$dir/checksrc.skip") or return; my @all=<$W>; for(@all) { $windows_os ? $_ =~ s/\r?\n$// : chomp; $skiplist{$_}=1; } close($W); } # Reads the .checksrc in $dir for any extended warnings to enable locally. # Currently there is no support for disabling warnings from the standard set, # and since that's already handled via !checksrc! commands there is probably # little use to add it. sub readlocalfile { my $i = 0; open(my $rcfile, "<", "$dir/.checksrc") or return; while(<$rcfile>) { $i++; # Lines starting with '#' are considered comments if (/^\s*(#.*)/) { next; } elsif (/^\s*enable ([A-Z]+)$/) { if(!defined($warnings_extended{$1})) { print STDERR "invalid warning specified in .checksrc: \"$1\"\n"; next; } $warnings{$1} = $warnings_extended{$1}; } elsif (/^\s*disable ([A-Z]+)$/) { if(!defined($warnings{$1})) { print STDERR "invalid warning specified in .checksrc: \"$1\"\n"; next; } # Accept-list push @alist, $1; } else { die "Invalid format in $dir/.checksrc on line $i\n"; } } close($rcfile); } sub checkwarn { my ($name, $num, $col, $file, $line, $msg, $error) = @_; my $w=$error?"error":"warning"; my $nowarn=0; #if(!$warnings{$name}) { # print STDERR "Dev! there's no description for $name!\n"; #} # checksrc.skip if($skiplist{$line}) { $nowarn = 1; } # !checksrc! 
controlled elsif($ignore{$name}) { $ignore{$name}--; $ignore_used{$name}++; $nowarn = 1; if(!$ignore{$name}) { # reached zero, enable again enable_warn($name, $num, $file, $line); } } if($nowarn) { $suppressed++; if(!$error) { $swarnings++; } else { $serrors++; } return; } if(!$error) { $warnings++; } else { $errors++; } $col++; print "$file:$num:$col: $w: $msg ($name)\n"; print " $line\n"; if($col < 80) { my $pref = (' ' x $col); print "${pref}^\n"; } } $file = shift @ARGV; while(defined $file) { if($file =~ /-D(.*)/) { $dir = $1; $file = shift @ARGV; next; } elsif($file =~ /-W(.*)/) { $wlist .= " $1 "; $file = shift @ARGV; next; } elsif($file =~ /-A(.+)/) { push @alist, $1; $file = shift @ARGV; next; } elsif($file =~ /-i([1-9])/) { $indent = $1 + 0; $file = shift @ARGV; next; } elsif($file =~ /-m([0-9]+)/) { $max_column = $1 + 0; $file = shift @ARGV; next; } elsif($file =~ /^(-h|--help)/) { undef $file; last; } last; } if(!$file) { print "checksrc.pl [option] <file1> [file2] ...\n"; print " Options:\n"; print " -A[rule] Accept this violation, can be used multiple times\n"; print " -D[DIR] Directory to prepend file names\n"; print " -h Show help output\n"; print " -W[file] Skip the given file - ignore all its flaws\n"; print " -i<n> Indent spaces. Default: 2\n"; print " -m<n> Maximum line length. 
Default: 79\n"; print "\nDetects and warns for these problems:\n"; my @allw = keys %warnings; push @allw, keys %warnings_extended; for my $w (sort @allw) { if($warnings{$w}) { printf (" %-18s: %s\n", $w, $warnings{$w}); } else { printf (" %-18s: %s[*]\n", $w, $warnings_extended{$w}); } } print " [*] = disabled by default\n"; exit; } readskiplist(); readlocalfile(); do { if("$wlist" !~ / $file /) { my $fullname = $file; $fullname = "$dir/$file" if ($fullname !~ '^\.?\.?/'); scanfile($fullname); } $file = shift @ARGV; } while($file); sub accept_violations { for my $r (@alist) { if(!$warnings{$r}) { print "'$r' is not a warning to accept!\n"; exit; } $ignore{$r}=999999; $ignore_used{$r}=0; } } sub checksrc_clear { undef %ignore; undef %ignore_set; undef @ignore_line; } sub checksrc_endoffile { my ($file) = @_; for(keys %ignore_set) { if($ignore_set{$_} && !$ignore_used{$_}) { checkwarn("UNUSEDIGNORE", $ignore_set{$_}, length($_)+11, $file, $ignore_line[$ignore_set{$_}], "Unused ignore: $_"); } } } sub enable_warn { my ($what, $line, $file, $l) = @_; # switch it back on, but warn if not triggered! if(!$ignore_used{$what}) { checkwarn("UNUSEDIGNORE", $line, length($what) + 11, $file, $l, "No warning was inhibited!"); } $ignore_set{$what}=0; $ignore_used{$what}=0; $ignore{$what}=0; } sub checksrc { my ($cmd, $line, $file, $l) = @_; if($cmd =~ / *([^ ]*) *(.*)/) { my ($enable, $what) = ($1, $2); $what =~ s: *\*/$::; # cut off end of C comment # print "ENABLE $enable WHAT $what\n"; if($enable eq "disable") { my ($warn, $scope)=($1, $2); if($what =~ /([^ ]*) +(.*)/) { ($warn, $scope)=($1, $2); } else { $warn = $what; $scope = 1; } # print "IGNORE $warn for SCOPE $scope\n"; if($scope eq "all") { $scope=999999; } # Comparing for a literal zero rather than the scalar value zero # covers the case where $scope contains the ending '*' from the # comment. If we use a scalar comparison (==) we induce warnings # on non-scalar contents. 
if($scope eq "0") { checkwarn("BADCOMMAND", $line, 0, $file, $l, "Disable zero not supported, did you mean to enable?"); } elsif($ignore_set{$warn}) { checkwarn("BADCOMMAND", $line, 0, $file, $l, "$warn already disabled from line $ignore_set{$warn}"); } else { $ignore{$warn}=$scope; $ignore_set{$warn}=$line; $ignore_line[$line]=$l; } } elsif($enable eq "enable") { enable_warn($what, $line, $file, $l); } else { checkwarn("BADCOMMAND", $line, 0, $file, $l, "Illegal !checksrc! command"); } } } sub nostrings { my ($str) = @_; $str =~ s/\".*\"//g; return $str; } sub scanfile { my ($file) = @_; my $line = 1; my $prevl=""; my $prevpl=""; my $l = ""; my $prep = 0; my $prevp = 0; open(my $R, '<', $file) || die "failed to open $file"; my $incomment=0; my @copyright=(); my %includes; checksrc_clear(); # for file based ignores accept_violations(); while(<$R>) { $windows_os ? $_ =~ s/\r?\n$// : chomp; my $l = $_; my $ol = $l; # keep the unmodified line for error reporting my $column = 0; # check for !checksrc! commands if($l =~ /\!checksrc\! 
(.*)/) { my $cmd = $1; checksrc($cmd, $line, $file, $l) } # check for a copyright statement and save the years if($l =~ /\* +copyright .* (\d\d\d\d|)/i) { my $count = 0; while($l =~ /([\d]{4})/g) { push @copyright, { year => $1, line => $line, col => index($l, $1), code => $l }; $count++; } if(!$count) { # year-less push @copyright, { year => -1, line => $line, col => index($l, $1), code => $l }; } } # detect long lines if(length($l) > $max_column) { checkwarn("LONGLINE", $line, length($l), $file, $l, "Longer than $max_column columns"); } # detect TAB characters if($l =~ /^(.*)\t/) { checkwarn("TABS", $line, length($1), $file, $l, "Contains TAB character", 1); } # detect trailing whitespace if($l =~ /^(.*)[ \t]+\z/) { checkwarn("TRAILINGSPACE", $line, length($1), $file, $l, "Trailing whitespace"); } # no space after comment start if($l =~ /^(.*)\/\*\w/) { checkwarn("COMMENTNOSPACESTART", $line, length($1) + 2, $file, $l, "Missing space after comment start"); } # no space at comment end if($l =~ /^(.*)\w\*\//) { checkwarn("COMMENTNOSPACEEND", $line, length($1) + 1, $file, $l, "Missing space end comment end"); } # ------------------------------------------------------------ # Above this marker, the checks were done on lines *including* # comments # ------------------------------------------------------------ # strip off C89 comments comment: if(!$incomment) { if($l =~ s/\/\*.*\*\// /g) { # full /* comments */ were removed! 
} if($l =~ s/\/\*.*//) { # start of /* comment was removed $incomment = 1; } } else { if($l =~ s/.*\*\///) { # end of comment */ was removed $incomment = 0; goto comment; } else { # still within a comment $l=""; } } # ------------------------------------------------------------ # Below this marker, the checks were done on lines *without* # comments # ------------------------------------------------------------ # prev line was a preprocessor **and** ended with a backslash if($prep && ($prevpl =~ /\\ *\z/)) { # this is still a preprocessor line $prep = 1; goto preproc; } $prep = 0; # crude attempt to detect // comments without too many false # positives if($l =~ /^(([^"\*]*)[^:"]|)\/\//) { checkwarn("CPPCOMMENTS", $line, length($1), $file, $l, "\/\/ comment"); } if($l =~ /^(\#\s*include\s+)([\">].*[>}"])/) { my ($pre, $path) = ($1, $2); if($includes{$path}) { checkwarn("INCLUDEDUP", $line, length($1), $file, $l, "duplicated include"); } $includes{$path} = $l; } # detect and strip preprocessor directives if($l =~ /^[ \t]*\#/) { # preprocessor line $prep = 1; goto preproc; } my $nostr = nostrings($l); # check spaces after for/if/while/function call if($nostr =~ /^(.*)(for|if|while|switch| ([a-zA-Z0-9_]+)) \((.)/) { my ($leading, $word, $extra, $first)=($1,$2,$3,$4); if($1 =~ / *\#/) { # this is a #if, treat it differently } elsif(defined $3 && $3 eq "return") { # return must have a space } elsif(defined $3 && $3 eq "case") { # case must have a space } elsif(($first eq "*") && ($word !~ /(for|if|while|switch)/)) { # A "(*" beginning makes the space OK because it wants to # allow function pointer declared } elsif($1 =~ / *typedef/) { # typedefs can use space-paren } else { checkwarn("SPACEBEFOREPAREN", $line, length($leading)+length($word), $file, $l, "$word with space"); } } # check for '== NULL' in if/while conditions but not if the thing on # the left of it is a function call if($nostr =~ /^(.*)(if|while)(\(.*?)([!=]= NULL|NULL [!=]=)/) { checkwarn("EQUALSNULL", 
$line, length($1) + length($2) + length($3), $file, $l, "we prefer !variable instead of \"== NULL\" comparisons"); } # check for '!= 0' in if/while conditions but not if the thing on # the left of it is a function call if($nostr =~ /^(.*)(if|while)(\(.*[^)]) != 0[^x]/) { checkwarn("NOTEQUALSZERO", $line, length($1) + length($2) + length($3), $file, $l, "we prefer if(rc) instead of \"rc != 0\" comparisons"); } # check spaces in 'do {' if($nostr =~ /^( *)do( *)\{/ && length($2) != 1) { checkwarn("DOBRACE", $line, length($1) + 2, $file, $l, "one space after do before brace"); } # check spaces in 'do {' elsif($nostr =~ /^( *)\}( *)while/ && length($2) != 1) { checkwarn("BRACEWHILE", $line, length($1) + 2, $file, $l, "one space between brace and while"); } if($nostr =~ /^((.*\s)(if) *\()(.*)\)(.*)/) { my $pos = length($1); my $postparen = $5; my $cond = $4; if($cond =~ / = /) { checkwarn("ASSIGNWITHINCONDITION", $line, $pos+1, $file, $l, "assignment within conditional expression"); } my $temp = $cond; $temp =~ s/\(//g; # remove open parens my $openc = length($cond) - length($temp); $temp = $cond; $temp =~ s/\)//g; # remove close parens my $closec = length($cond) - length($temp); my $even = $openc == $closec; if($l =~ / *\#/) { # this is a #if, treat it differently } elsif($even && $postparen && ($postparen !~ /^ *$/) && ($postparen !~ /^ *[,{&|\\]+/)) { checkwarn("ONELINECONDITION", $line, length($l)-length($postparen), $file, $l, "conditional block on the same line"); } } # check spaces after open parentheses if($l =~ /^(.*[a-z])\( /i) { checkwarn("SPACEAFTERPAREN", $line, length($1)+1, $file, $l, "space after open parenthesis"); } # check spaces before close parentheses, unless it was a space or a # close parenthesis! if($l =~ /(.*[^\) ]) \)/) { checkwarn("SPACEBEFORECLOSE", $line, length($1)+1, $file, $l, "space before close parenthesis"); } # check spaces before comma! 
if($l =~ /(.*[^ ]) ,/) { checkwarn("SPACEBEFORECOMMA", $line, length($1)+1, $file, $l, "space before comma"); } # check for "return(" without space if($l =~ /^(.*)return\(/) { if($1 =~ / *\#/) { # this is a #if, treat it differently } else { checkwarn("RETURNNOSPACE", $line, length($1)+6, $file, $l, "return without space before paren"); } } # check for "sizeof" without parenthesis if(($l =~ /^(.*)sizeof *([ (])/) && ($2 ne "(")) { if($1 =~ / *\#/) { # this is a #if, treat it differently } else { checkwarn("SIZEOFNOPAREN", $line, length($1)+6, $file, $l, "sizeof without parenthesis"); } } # check for comma without space if($l =~ /^(.*),[^ \n]/) { my $pref=$1; my $ign=0; if($pref =~ / *\#/) { # this is a #if, treat it differently $ign=1; } elsif($pref =~ /\/\*/) { # this is a comment $ign=1; } elsif($pref =~ /[\"\']/) { $ign = 1; # There is a quote here, figure out whether the comma is # within a string or '' or not. if($pref =~ /\"/) { # within a string } elsif($pref =~ /\'$/) { # a single letter } else { $ign = 0; } } if(!$ign) { checkwarn("COMMANOSPACE", $line, length($pref)+1, $file, $l, "comma without following space"); } } # check for "} else" if($l =~ /^(.*)\} *else/) { checkwarn("BRACEELSE", $line, length($1), $file, $l, "else after closing brace on same line"); } # check for "){" if($l =~ /^(.*)\)\{/) { checkwarn("PARENBRACE", $line, length($1)+1, $file, $l, "missing space after close paren"); } # check for "^{" with an empty line before it if(($l =~ /^\{/) && ($prevl =~ /^[ \t]*\z/)) { checkwarn("EMPTYLINEBRACE", $line, 0, $file, $l, "empty line before open brace"); } # check for space before the semicolon last in a line if($l =~ /^(.*[^ ].*) ;$/) { checkwarn("SPACESEMICOLON", $line, length($1), $file, $ol, "no space before semicolon"); } # check for space before the colon in a switch label if($l =~ /^( *(case .+|default)) :/) { checkwarn("SPACESWITCHCOLON", $line, length($1), $file, $ol, "no space before colon of switch label"); } # scan for use of banned 
functions if($l =~ /^(.*\W) (gmtime|localtime| gets| strtok| v?sprintf| (str|_mbs|_tcs|_wcs)n?cat| LoadLibrary(Ex)?(A|W)?) \s*\( /x) { checkwarn("BANNEDFUNC", $line, length($1), $file, $ol, "use of $2 is banned"); } if($warnings{"STRERROR"}) { # scan for use of banned strerror. This is not a BANNEDFUNC to # allow for individual enable/disable of this warning. if($l =~ /^(.*\W)(strerror)\s*\(/x) { if($1 !~ /^ *\#/) { # skip preprocessor lines checkwarn("STRERROR", $line, length($1), $file, $ol, "use of $2 is banned"); } } } # scan for use of snprintf for curl-internals reasons if($l =~ /^(.*\W)(v?snprintf)\s*\(/x) { checkwarn("SNPRINTF", $line, length($1), $file, $ol, "use of $2 is banned"); } # scan for use of non-binary fopen without the macro if($l =~ /^(.*\W)fopen\s*\([^,]*, *\"([^"]*)/) { my $mode = $2; if($mode !~ /b/) { checkwarn("FOPENMODE", $line, length($1), $file, $ol, "use of non-binary fopen without FOPEN_* macro: $mode"); } } # check for open brace first on line but not first column only alert # if previous line ended with a close paren and it wasn't a cpp line if(($prevl =~ /\)\z/) && ($l =~ /^( +)\{/) && !$prevp) { checkwarn("BRACEPOS", $line, length($1), $file, $ol, "badly placed open brace"); } # if the previous line starts with if/while/for AND ends with an open # brace, or an else statement, check that this line is indented $indent # more steps, if not a cpp line if(!$prevp && ($prevl =~ /^( *)((if|while|for)\(.*\{|else)\z/)) { my $first = length($1); # this line has some character besides spaces if($l =~ /^( *)[^ ]/) { my $second = length($1); my $expect = $first+$indent; if($expect != $second) { my $diff = $second - $first; checkwarn("INDENTATION", $line, length($1), $file, $ol, "not indented $indent steps (uses $diff)"); } } } # if the previous line starts with if/while/for AND ends with a closed # parenthesis and there's an equal number of open and closed # parentheses, check that this line is indented $indent more steps, if # not a cpp line 
elsif(!$prevp && ($prevl =~ /^( *)(if|while|for)(\(.*\))\z/)) { my $first = length($1); my $op = $3; my $cl = $3; $op =~ s/[^(]//g; $cl =~ s/[^)]//g; if(length($op) == length($cl)) { # this line has some character besides spaces if($l =~ /^( *)[^ ]/) { my $second = length($1); my $expect = $first+$indent; if($expect != $second) { my $diff = $second - $first; checkwarn("INDENTATION", $line, length($1), $file, $ol, "not indented $indent steps (uses $diff)"); } } } } # check for 'char * name' if(($l =~ /(^.*(char|int|long|void|CURL|CURLM|CURLMsg|[cC]url_[A-Za-z_]+|struct [a-zA-Z_]+) *(\*+)) (\w+)/) && ($4 !~ /^(const|volatile)$/)) { checkwarn("ASTERISKSPACE", $line, length($1), $file, $ol, "space after declarative asterisk"); } # check for 'char*' if(($l =~ /(^.*(char|int|long|void|curl_slist|CURL|CURLM|CURLMsg|curl_httppost|sockaddr_in|FILE)\*)/)) { checkwarn("ASTERISKNOSPACE", $line, length($1)-1, $file, $ol, "no space before asterisk"); } # check for 'void func() {', but avoid false positives by requiring # both an open and closed parentheses before the open brace if($l =~ /^((\w).*)\{\z/) { my $k = $1; $k =~ s/const *//; $k =~ s/static *//; if($k =~ /\(.*\)/) { checkwarn("BRACEPOS", $line, length($l)-1, $file, $ol, "wrongly placed open brace"); } } # check for equals sign without spaces next to it if($nostr =~ /(.*)\=[a-z0-9]/i) { checkwarn("EQUALSNOSPACE", $line, length($1)+1, $file, $ol, "no space after equals sign"); } # check for equals sign without spaces before it elsif($nostr =~ /(.*)[a-z0-9]\=/i) { checkwarn("NOSPACEEQUALS", $line, length($1)+1, $file, $ol, "no space before equals sign"); } # check for plus signs without spaces next to it if($nostr =~ /(.*)[^+]\+[a-z0-9]/i) { checkwarn("PLUSNOSPACE", $line, length($1)+1, $file, $ol, "no space after plus sign"); } # check for plus sign without spaces before it elsif($nostr =~ /(.*)[a-z0-9]\+[^+]/i) { checkwarn("NOSPACEPLUS", $line, length($1)+1, $file, $ol, "no space before plus sign"); } # check for 
semicolons without space next to it if($nostr =~ /(.*)\;[a-z0-9]/i) { checkwarn("SEMINOSPACE", $line, length($1)+1, $file, $ol, "no space after semicolon"); } # typedef struct ... { if($nostr =~ /^(.*)typedef struct.*{/) { checkwarn("TYPEDEFSTRUCT", $line, length($1)+1, $file, $ol, "typedef'ed struct"); } if($nostr =~ /(.*)! +(\w|\()/) { checkwarn("EXCLAMATIONSPACE", $line, length($1)+1, $file, $ol, "space after exclamation mark"); } # check for more than one consecutive space before open brace or # question mark. Skip lines containing strings since they make it hard # due to artificially getting multiple spaces if(($l eq $nostr) && $nostr =~ /^(.*(\S)) + [{?]/i) { checkwarn("MULTISPACE", $line, length($1)+1, $file, $ol, "multiple spaces"); } preproc: $line++; $prevp = $prep; $prevl = $ol if(!$prep); $prevpl = $ol if($prep); } if(!scalar(@copyright)) { checkwarn("COPYRIGHT", 1, 0, $file, "", "Missing copyright statement", 1); } # COPYRIGHTYEAR is an extended warning so we must first see if it has been # enabled in .checksrc if(defined($warnings{"COPYRIGHTYEAR"})) { # The check for updated copyrightyear is overly complicated in order to # not punish current hacking for past sins. The copyright years are # right now a bit behind, so enforcing copyright year checking on all # files would cause hundreds of errors. Instead we only look at files # which are tracked in the Git repo and edited in the workdir, or # committed locally on the branch without being in upstream master. # # The simple and naive test is to simply check for the current year, # but updating the year even without an edit is against project policy # (and it would fail every file on January 1st). # # A rather more interesting, and correct, check would be to not test # only locally committed files but inspect all files wrt the year of # their last commit. 
Removing the `git rev-list origin/master..HEAD` # condition below will enforce copyright year checks against the year # the file was last committed (and thus edited to some degree). my $commityear = undef; @copyright = sort {$$b{year} cmp $$a{year}} @copyright; # if the file is modified, assume commit year this year if(`git status -s -- $file` =~ /^ [MARCU]/) { $commityear = (localtime(time))[5] + 1900; } else { # min-parents=1 to ignore wrong initial commit in truncated repos my $grl = `git rev-list --max-count=1 --min-parents=1 --timestamp HEAD -- $file`; if($grl) { chomp $grl; $commityear = (localtime((split(/ /, $grl))[0]))[5] + 1900; } } if(defined($commityear) && scalar(@copyright) && $copyright[0]{year} != $commityear) { checkwarn("COPYRIGHTYEAR", $copyright[0]{line}, $copyright[0]{col}, $file, $copyright[0]{code}, "Copyright year out of date, should be $commityear, " . "is $copyright[0]{year}", 1); } } if($incomment) { checkwarn("OPENCOMMENT", 1, 0, $file, "", "Missing closing comment", 1); } checksrc_endoffile($file); close($R); } if($errors || $warnings || $verbose) { printf "checksrc: %d errors and %d warnings\n", $errors, $warnings; if($suppressed) { printf "checksrc: %d errors and %d warnings suppressed\n", $serrors, $swarnings; } exit 5; # return failure } trurl-0.16.1/scripts/generate_completions.sh0000775000000000000000000000513315010312005016120 0ustar00#!/bin/bash ########################################################################## # _ _ # Project | |_ _ __ _ _ _ __| | # | __| '__| | | | '__| | # | |_| | | |_| | | | | # \__|_| \__,_|_| |_| # # Copyright (C) Daniel Stenberg, , et al. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at https://curl.se/docs/copyright.html. 
# # You may opt to use, copy, modify, merge, publish, distribute and/or sell # copies of the Software, and permit persons to whom the Software is # furnished to do so, under the terms of the COPYING file. # # This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY # KIND, either express or implied. # # SPDX-License-Identifier: curl # ########################################################################## if [ -z "$1" ]; then echo "expected a trurl.md file to be passed in..." exit 1 fi TRURL_MD_FILE=$1 ALL_FLAGS="$(sed -n \ -e 's/"//g' \ -e '/\# URL COMPONENTS/q;p' \ < "${TRURL_MD_FILE}" \ | grep "##" \ | awk '{printf "%s%s%s%s ", $2, $3, $4, $5}')" TRURL_COMPONENT_OPTIONS="" TRURL_STANDALONE_FLAGS="" TRURL_RANDOM_OPTIONS="" TRURL_COMPONENT_LIST="$(sed -n \ -e 's/"//g' \ -e '1,/\# URL COMPONENTS/ d' \ -e '/\# JSON output format/q;p' \ < "${TRURL_MD_FILE}" \ | grep "##" \ | awk '{printf "\"%s\" ", $2}')" for flag in $ALL_FLAGS; do # these are now TRURL_STANDALONE if echo "$flag" | grep -q "="; then TRURL_COMPONENT_OPTIONS+="$(echo "$flag" \ | awk '{split($0, a, ","); for(i in a) {printf "%s ", a[i]}}' \ | cut -f1 -d '[' \ | awk '{printf "\"%s\" ", $1}')" elif echo "$flag" | grep -q "\["; then TRURL_RANDOM_OPTIONS+="$(echo "$flag" \ | awk '{split($0, a, ","); for(i in a) {printf "%s ", a[i]}}' \ | cut -f1 -d '[' \ | awk '{printf "\"%s\" ", $1}')" else TRURL_STANDALONE_FLAGS+="$(echo "$flag" \ | awk '{split($0, a, ","); for(i in a) {printf "\"%s\" ", a[i]}}')" fi done function generate_zsh() { sed -e "s/@TRURL_RANDOM_OPTIONS@/${TRURL_RANDOM_OPTIONS}/g" \ -e "s/@TRURL_STANDALONE_FLAGS@/${TRURL_STANDALONE_FLAGS}/g" \ -e "s/@TRURL_COMPONENT_OPTIONS@/${TRURL_COMPONENT_OPTIONS}/g" \ -e "s/@TRURL_COMPONENT_LIST@/${TRURL_COMPONENT_LIST}/g" \ ./completions/_trurl.zsh.in > ./completions/_trurl.zsh } generate_zsh "$TRURL_RANDOM_OPTIONS" trurl-0.16.1/scripts/mkrelease0000775000000000000000000000445215010312005013254 0ustar00#!/bin/sh 
########################################################################## # _ _ ____ _ # Project ___| | | | _ \| | # / __| | | | |_) | | # | (__| |_| | _ <| |___ # \___|\___/|_| \_\_____| # # Copyright (C) Daniel Stenberg, , et al. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at https://curl.se/docs/copyright.html. # # You may opt to use, copy, modify, merge, publish, distribute and/or sell # copies of the Software, and permit persons to whom the Software is # furnished to do so, under the terms of the COPYING file. # # This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY # KIND, either express or implied. # # SPDX-License-Identifier: curl # ########################################################################## set -eu export LC_ALL=C export TZ=UTC version="${1:-}" if [ -z "$version" ]; then echo "Specify a version number!" exit 1 fi rel="trurl-$version" mkdir $rel # update title in markdown manpage sed -ie "s/^Source: trurl \([0-9.]*\)/Source: trurl $version/" trurl.md # update version number in header file sed -ie "s/\"[\.0-9]*\"/\"$version\"/" version.h # render the manpage into nroff ./scripts/cd2nroff trurl.md > $rel/trurl.1 # create a release directory tree cp -p --parents $(git ls-files | grep -vE '^(.github/|.reuse/|.gitignore|LICENSES/)') $rel # create tarball from the tree targz="$rel.tar.gz" tar cfz "$targz" "$rel" timestamp=${SOURCE_DATE_EPOCH:-$(date +"%s")} filestamp=$(date -d "@$timestamp" +"%Y%m%d%H%M.%S") retar() { tempdir=$1 rm -rf "$tempdir" mkdir "$tempdir" cd "$tempdir" gzip -dc "../$targz" | tar -xf - find trurl-* -depth -exec touch -c -t "$filestamp" '{}' + tar --create --format=ustar --owner=0 --group=0 --numeric-owner --sort=name trurl-* | gzip --best --no-name > out.tar.gz mv out.tar.gz ../ cd .. 
rm -rf "$tempdir" } # make it reproducible retar ".tarbuild" mv out.tar.gz "$targz" # remove the temporary directory rm -rf $rel # Set deterministic timestamp touch -c -t "$filestamp" "$targz" echo "Now sign the release:" echo "gpg -b -a '$targz'" trurl-0.16.1/test.py0000664000000000000000000002276615010312005011230 0ustar00#!/usr/bin/env python3 ########################################################################## # _ _ ____ _ # Project ___| | | | _ \| | # / __| | | | |_) | | # | (__| |_| | _ <| |___ # \___|\___/|_| \_\_____| # # Copyright (C) Daniel Stenberg, , et al. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at https://curl.se/docs/copyright.html. # # You may opt to use, copy, modify, merge, publish, distribute and/or sell # copies of the Software, and permit persons to whom the Software is # furnished to do so, under the terms of the COPYING file. # # This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY # KIND, either express or implied. 
# # SPDX-License-Identifier: curl # ########################################################################## import sys from os import getcwd, path import json import shlex from subprocess import PIPE, run, Popen from dataclasses import dataclass, asdict from typing import Any, Optional, TextIO import locale PROGNAME = "trurl" TESTFILE = "tests.json" VALGRINDTEST = "valgrind" VALGRINDARGS = ["--error-exitcode=1", "--leak-check=full", "-q"] RED = "\033[91m" # used to mark unsuccessful tests NOCOLOR = "\033[0m" EXIT_SUCCESS = 0 EXIT_ERROR = 1 @dataclass class CommandOutput: stdout: Any returncode: int stderr: str def testComponent(value, exp): if isinstance(exp, bool): result = value == 0 or value not in ("", []) if exp: return result else: return not result return value == exp # checks if valgrind is installed def check_valgrind(): process = Popen(VALGRINDTEST + " --version", shell=True, stdout=PIPE, stderr=PIPE, encoding="utf-8") output, error = process.communicate() if output.startswith(VALGRINDTEST) and not len(error): return True return False def getcharmap(): process = Popen("locale charmap", shell=True, stdout=PIPE, stderr=PIPE, encoding="utf-8"); output, error = process.communicate() return output.strip() class TestCase: def __init__(self, testIndex, runnerCmd, baseCmd, **testCase): self.testIndex = testIndex self.runnerCmd = runnerCmd self.baseCmd = baseCmd self.arguments = testCase["input"]["arguments"] self.expected = testCase["expected"] self.commandOutput: CommandOutput = None self.testPassed: bool = False def runCommand(self, cmdfilter: Optional[str], runWithValgrind: bool): # Skip test if none of the arguments contain the keyword if cmdfilter and all(cmdfilter not in arg for arg in self.arguments): return False cmd = [self.baseCmd] args = self.arguments if self.runnerCmd != "": cmd = [self.runnerCmd] args = [self.baseCmd] + self.arguments elif runWithValgrind: cmd = [VALGRINDTEST] args = VALGRINDARGS + [self.baseCmd] + self.arguments output = run( 
cmd + args, stdout=PIPE, stderr=PIPE, encoding="utf-8" ) if isinstance(self.expected["stdout"], list): # if we don't expect string, parse to json try: stdout = json.loads(output.stdout) except json.decoder.JSONDecodeError: stdout = None else: stdout = output.stdout # assume stderr is always going to be string stderr = output.stderr # runners (e.g. wine) spill their own output into stderr, # ignore stderr tests when using a runner. if self.runnerCmd != "" and "stderr" in self.expected: stderr = self.expected["stderr"] self.commandOutput = CommandOutput(stdout, output.returncode, stderr) return True def test(self): # return true only if stdout, stderr and errorcode # are equal to the ones found in the testfile self.testPassed = all( testComponent(asdict(self.commandOutput)[k], exp) for k, exp in self.expected.items()) return self.testPassed def _printVerbose(self, output: TextIO): self._printConcise(output) for component, exp in self.expected.items(): value = asdict(self.commandOutput)[component] itemFail = self.commandOutput.returncode == 1 or \ not testComponent(value, exp) print(f"--- {component} --- ", file=output) print("expected:", file=output) print("nothing" if exp is False else "something" if exp is True else f"{exp!r}",file=output) print("got:", file=output) header = RED if itemFail else "" footer = NOCOLOR if itemFail else "" print(f"{header}{value!r}{footer}", file=output) print() def _printConcise(self, output: TextIO): if self.testPassed: header = "" result = "passed" footer = "" else: header = RED result = "failed" footer = NOCOLOR text = f"{self.testIndex}: {result}\t{shlex.join(self.arguments)}" print(f"{header}{text}{footer}", file=output) def printDetail(self, verbose: bool = False, failed: bool = False): output: TextIO = sys.stderr if failed else sys.stdout if verbose: self._printVerbose(output) else: self._printConcise(output) def main(argc, argv): ret = EXIT_SUCCESS baseDir = path.dirname(path.realpath(argv[0])) locale.setlocale(locale.LC_ALL, 
"") # python on windows does not always seem to find the # executable if it is in a different output directory than # the python script, even if it is in the current working # directory, using absolute paths to the executable and json # file makes it reliably find the executable baseCmd = path.join(getcwd(), PROGNAME) # the .exe on the end is necessary when using absolute paths if sys.platform == "win32" or sys.platform == "cygwin": baseCmd += ".exe" with open(path.join(baseDir, TESTFILE), "r", encoding="utf-8") as file: allTests = json.load(file) testIndexesToRun = [] # if argv[1] exists and starts with int cmdfilter = "" testIndexesToRun = list(range(len(allTests))) runWithValgrind = False verboseDetail = False runnerCmd = "" if argc > 1: for arg in argv[1:]: if arg[0].isnumeric(): # run only test cases separated by "," testIndexesToRun = [] for caseIndex in arg.split(","): testIndexesToRun.append(int(caseIndex)) elif arg == "--with-valgrind": runWithValgrind = True elif arg == "--verbose": verboseDetail = True elif arg.startswith("--trurl="): baseCmd = arg[len("--trurl="):] elif arg.startswith("--runner="): runnerCmd = arg[len("--runner="):] else: cmdfilter = arg if runWithValgrind and not check_valgrind(): print(f'Error: {VALGRINDTEST} is not installed!', file=sys.stderr) return EXIT_ERROR # check if the trurl executable exists if path.isfile(baseCmd): # get the version info for the feature list args = ["--version"] if runnerCmd != "": cmd = [runnerCmd] args = [baseCmd] + args else: cmd = [baseCmd] output = run( cmd + args, stdout=PIPE, stderr=PIPE, encoding="utf-8" ) features = output.stdout.split('\n')[1].split()[1:] numTestsFailed = 0 numTestsPassed = 0 numTestsSkipped = 0 for testIndex in testIndexesToRun: # skip tests if required features are not met required = allTests[testIndex].get("required", None) if required and not set(required).issubset(set(features)): print(f"Missing feature, skipping test {testIndex + 1}.") numTestsSkipped += 1 continue 
encoding = allTests[testIndex].get("encoding", None) if encoding and encoding != getcharmap(): print(f"Invalid locale, skipping test {testIndex + 1}.") numTestsSkipped += 1 continue; test = TestCase(testIndex + 1, runnerCmd, baseCmd, **allTests[testIndex]) if test.runCommand(cmdfilter, runWithValgrind): if test.test(): # passed test.printDetail(verbose=verboseDetail) numTestsPassed += 1 else: test.printDetail(verbose=True, failed=True) numTestsFailed += 1 # finally print the results to terminal print("Finished:") result = ", ".join([ f"Failed: {numTestsFailed}", f"Passed: {numTestsPassed}", f"Skipped: {numTestsSkipped}", f"Total: {len(testIndexesToRun)}" ]) if (numTestsFailed == 0): print("Passed! - ", result) else: ret = f"Failed! - {result}" else: ret = f" error: File \"{baseCmd}\" not found!" return ret if __name__ == "__main__": sys.exit(main(len(sys.argv), sys.argv)) trurl-0.16.1/testfiles/0000755000000000000000000000000015010312005011662 5ustar00trurl-0.16.1/testfiles/test0000.txt0000644000000000000000000001004715010312005013704 
0ustar00https://aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/ http://example.org trurl-0.16.1/testfiles/test0001.txt0000644000000000000000000000020615010312005013701 0ustar00https://curl.se/ https://docs.python.org/ git://github.com/curl/curl.git http://example.org xyz://hello/?hitrurl-0.16.1/testfiles/test0002.txt0000644000000000000000000000000015010312005013672 0ustar00trurl-0.16.1/tests.json0000664000000000000000000023674415010312005011737 0ustar00[ { "input": { "arguments": [ "example.com" ] }, "expected": { "stdout": "http://example.com/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "http://example.com" ] }, "expected": { "stdout": "http://example.com/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com" ] }, "expected": { "stdout": "https://example.com/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "hp://example.com" ] }, "expected": { "stdout": "hp://example.com/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [] }, "expected": { "stdout": "", "stderr": "trurl error: not enough input for a URL\ntrurl error: Try trurl -h for help\n", "returncode": 7 } }, { "input": { "arguments": [ "ftp.example.com" ] }, "expected": { "stdout": "ftp://ftp.example.com/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com/../moo" ] }, "expected": { "stdout": "https://example.com/moo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com/.././moo" ] }, "expected": { "stdout": "https://example.com/moo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com/test/../moo" ] }, "expected": { "stdout": "https://example.com/moo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "localhost", "--append", "path=moo" ] }, "expected": { "stdout": "http://localhost/moo\n", "stderr": "", "returncode": 0 } }, { "input": { 
"arguments": [ "localhost", "-a", "path=moo" ] }, "expected": { "stdout": "http://localhost/moo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--set", "host=moo", "--set", "scheme=http" ] }, "expected": { "stdout": "http://moo/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "-shost=moo", "-sscheme=http" ] }, "expected": { "stdout": "http://moo/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--set=host=moo", "--set=scheme=http" ] }, "expected": { "stdout": "http://moo/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "-s", "host=moo", "-s", "scheme=http" ] }, "expected": { "stdout": "http://moo/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--set", "host=moo", "--set", "scheme=https", "--set", "port=999" ] }, "expected": { "stdout": "https://moo:999/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--set", "host=moo", "--set", "scheme=ftps", "--set", "path=/hello" ] }, "expected": { "stdout": "ftps://moo/hello\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se", "--set", "host=example.com" ] }, "expected": { "stdout": "https://example.com/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--set", "host=example.com", "--set", "scheme=ftp" ] }, "expected": { "stdout": "ftp://example.com/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/we/are.html", "--redirect", "here.html" ] }, "expected": { "stdout": "https://curl.se/we/here.html\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/we/../are.html", "--set", "port=8080" ] }, "expected": { "stdout": "https://curl.se:8080/are.html\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "imap://curl.se:22/", "-s", "port=143" ] }, "required": ["imap-options"], "expected": { "stdout": "imap://curl.se/\n", "stderr": "", "returncode": 0 } }, { "input": 
{ "arguments": [ "--keep-port", "https://curl.se:22/", "-s", "port=443" ] }, "expected": { "stdout": "https://curl.se:443/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--keep-port", "https://curl.se:22/", "-s", "port=443", "--get", "{url}" ] }, "expected": { "stdout": "https://curl.se:443/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/we/are.html", "--get", "{path}" ] }, "expected": { "stdout": "/we/are.html\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/we/are.html", "-g{path}" ] }, "expected": { "stdout": "/we/are.html\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--default-port", "--url", "imap://curl.se/we/are.html", "--get", "{port}" ] }, "required": ["imap-options"], "expected": { "stdout": "143\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/we/are.html", "--get", "{scheme}" ] }, "expected": { "stdout": "https\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/we/are.html", "--get", "{:scheme}" ] }, "expected": { "stdout": "https\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se:55/we/are.html", "--get", "{url:port}" ] }, "expected": { "stdout": "55\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/%2e%61%13", "--get", "{:path}" ] }, "expected": { "stdout": "/.a%13\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se?%2e%61%13", "--get", "{:query}" ] }, "expected": { "stdout": ".a%13\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/#%2e%61%13", "--get", "{:fragment}" ] }, "expected": { "stdout": ".a%13\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://example.com/#%2e%61%13%Fa" ] }, "expected": { "stdout": 
"https://example.com/#.a%13%fa\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://hello@curl.se/we/are.html", "--get", "{user}" ] }, "expected": { "stdout": "hello\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://hello:secret@curl.se/we/are.html", "--get", "{password}" ] }, "expected": { "stdout": "secret\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "imap://hello:secret;crazy@curl.se/we/are.html", "--get", "{options}" ] }, "required": ["imap-options"], "expected": { "stdout": "crazy\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/we/are.html", "--get", "{host}" ] }, "expected": { "stdout": "curl.se\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://10.1/we/are.html", "--get", "{host}" ] }, "required": ["normalize-ipv4"], "expected": { "stdout": "10.0.0.1\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://[fe80::0000:20c:29ff:fe9c:409b]:8080/we/are.html", "--get", "{host}" ] }, "required": ["zone-id"], "expected": { "stdout": "[fe80::20c:29ff:fe9c:409b]\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://[fe80::0000:20c:29ff:fe9c:409b%euth0]:8080/we/are.html", "--get", "{zoneid}" ] }, "expected": { "stdout": "euth0\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://[fe80::0000:20c:29ff:fe9c:409b%eth0]:8080/we/are.html", "--get", "{zoneid}" ] }, "expected": { "stdout": "eth0\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/we/are.html?user=many#more", "--get", "{query}" ] }, "expected": { "stdout": "user=many\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/we/are.html?user=many#more", "--get", "{fragment}" ] }, "expected": { "stdout": "more\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": 
[ "--url", "imap://curl.se/we/are.html", "-g", "{default:port}" ] }, "required": ["imap-options"], "expected": { "stdout": "143\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/hello", "--append", "path=you" ] }, "expected": { "stdout": "https://curl.se/hello/you\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/hello", "--append", "path=you index.html" ] }, "expected": { "stdout": "https://curl.se/hello/you%20index.html\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se?name=hello", "--append", "query=search=string" ] }, "expected": { "stdout": "https://curl.se/?name=hello&search=string\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/hello", "--set", "user=:hej:" ] }, "expected": { "stdout": "https://%3ahej%3a@curl.se/hello\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/hello", "--set", "user=hej", "--set", "password=secret" ] }, "expected": { "stdout": "https://hej:secret@curl.se/hello\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/hello", "--set", "query:=user=me" ] }, "expected": { "stdout": "https://curl.se/hello?user=me\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/hello", "--set", "query=user=me" ] }, "expected": { "stdout": "https://curl.se/hello?user%3dme\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/hello", "--set", "fragment= hello" ] }, "expected": { "stdout": "https://curl.se/hello#%20hello\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://curl.se/hello", "--set", "fragment:=%20hello" ] }, "expected": { "stdout": "https://curl.se/hello#%20hello\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "localhost", "--append", "query=hello=foo" ] }, 
"expected": { "stdout": "http://localhost/?hello=foo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "localhost", "-a", "query=hello=foo" ] }, "expected": { "stdout": "http://localhost/?hello=foo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?search=hello&utm_source=tracker", "--trim", "query=utm_*" ] }, "expected": { "stdout": "https://example.com/?search=hello\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?search=hello&utm_source=tracker", "--qtrim", "utm_*" ] }, "expected": { "stdout": "https://example.com/?search=hello\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?search=hello&utm_source=tracker&more=data", "--trim", "query=utm_*" ] }, "expected": { "stdout": "https://example.com/?search=hello&more=data\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?search=hello&utm_source=tracker&more=data", "--qtrim", "utm_*" ] }, "expected": { "stdout": "https://example.com/?search=hello&more=data\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?search=hello&more=data", "--qtrim", "utm_*" ] }, "expected": { "stdout": "https://example.com/?search=hello&more=data\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?utm_source=tracker", "--trim", "query=utm_*" ] }, "expected": { "stdout": "https://example.com/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?search=hello&utm_source=tracker&more=data", "--qtrim", "utm_source" ] }, "expected": { "stdout": "https://example.com/?search=hello&more=data\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?search=hello&utm_source=tracker&more=data", "--qtrim", "utm_source", "--qtrim", "more", "--qtrim", "search" ] }, "expected": { "stdout": "https://example.com/\n", "stderr": "", "returncode": 0 } }, { 
"input": { "arguments": [ "--accept-space", "--url", "gopher://localhost/ with space" ] }, "required": ["white-space"], "expected": { "stdout": "gopher://localhost/%20with%20space\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--accept-space", "--url", "https://localhost/?with space" ] }, "required": ["white-space"], "expected": { "stdout": "https://localhost/?with+space\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://daniel@curl.se:22/", "-s", "port=", "-s", "user=" ] }, "expected": { "stdout": "https://curl.se/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?moo&search=hello", "--qtrim", "search" ] }, "expected": { "stdout": "https://example.com/?moo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?search=hello&moo", "--qtrim", "search" ] }, "expected": { "stdout": "https://example.com/?moo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?search=hello", "--qtrim", "search", "--append", "query=moo" ] }, "expected": { "stdout": "https://example.com/?moo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--keep-port", "https://hello:443/foo" ] }, "expected": { "stdout": "https://hello:443/foo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--keep-port", "ftp://hello:21/foo" ] }, "expected": { "stdout": "ftp://hello:21/foo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://hello:443/foo", "-s", "scheme=ftp" ] }, "expected": { "stdout": "ftp://hello:443/foo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--keep-port", "ftp://hello:443/foo", "-s", "scheme=https" ] }, "expected": { "stdout": "https://hello:443/foo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?utm_source=tra%20cker&address%20=home&here=now&thisthen", "-g", "{query:utm_source}" ] }, "expected": { "stdout": "tra cker\n", 
"stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?utm_source=tra%20cker&address%20=home&here=now&thisthen", "-g", "{:query:utm_source}" ] }, "expected": { "stdout": "tra+cker\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?utm_source=tra%20cker&address%20=home&here=now&thisthen", "-g", "{:query:utm_}" ] }, "expected": { "stdout": "\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?utm_source=tra%20cker&address%20=home&here=now&thisthen", "-g", "{:query:UTM_SOURCE}" ] }, "expected": { "stdout": "\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?utm_source=tracker&monkey=123", "--sort-query" ] }, "expected": { "stdout": "https://example.com/?monkey=123&utm_source=tracker\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?a=b&c=d&", "--sort-query" ] }, "expected": { "stdout": "https://example.com/?a=b&c=d\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?a=b&c=d&", "--sort-query", "--trim", "query=a" ] }, "expected": { "stdout": "https://example.com/?c=d\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "example.com:29", "--set", "port=" ] }, "expected": { "stdout": "http://example.com/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "HTTPS://example.com" ] }, "expected": { "stdout": "https://example.com/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://EXAMPLE.com" ] }, "expected": { "stdout": "https://EXAMPLE.com/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "https://example.com/FOO/BAR" ] }, "expected": { "stdout": "https://example.com/FOO/BAR\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "[2001:0db8:0000:0000:0000:ff00:0042:8329]" ] }, "required": ["zone-id"], "expected": { "stdout": 
"http://[2001:db8::ff00:42:8329]/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?utm=tra%20cker:address%20=home:here=now:thisthen", "--sort-query", "--query-separator", ":" ] }, "expected": { "stdout": "https://example.com/?address+=home:here=now:thisthen:utm=tra+cker\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "foo?a=bCd=eCe=f", "--query-separator", "C", "--trim", "query=d" ] }, "expected": { "stdout": "http://foo/?a=bCe=f\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "localhost", "-g", "{scheme} {host" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": "http {host\n" } }, { "input": { "arguments": [ "localhost", "-g", "[scheme] [host" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": "http [host\n" } }, { "input": { "arguments": [ "localhost", "-g", "\\{{scheme}\\[" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": "{http[\n" } }, { "input": { "arguments": [ "localhost", "-g", "\\\\[" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": "\\[\n" } }, { "input": { "arguments": [ "https://u:s@foo?moo", "-g", "[scheme][user][password][query]" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": "httpsusmoo\n" } }, { "input": { "arguments": [ "hej?a=b&a=c&a=d&b=a", "-g", "{query-all:a}" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": "b c d\n" } }, { "input": { "arguments": [ "https://curl.se?name=mr%00smith", "--get", "{query:name}" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": "mr.smith\n" } }, { "input": { "arguments": [ "--keep-port", "https://curl.se", "--iterate", "port=80 81 443" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": "https://curl.se:80/\nhttps://curl.se:81/\nhttps://curl.se:443/\n" } }, { "input": { "arguments": [ "https://curl.se", "--iterate", "port=81 443", "--iterate", "scheme=sftp moo" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": 
"sftp://curl.se:81/\nmoo://curl.se:81/\nsftp://curl.se:443/\nmoo://curl.se:443/\n" } }, { "input": { "arguments": [ "https://curl.se", "--iterate", "port=81 443", "--iterate", "scheme=sftp moo", "--iterate", "port=2 1" ] }, "expected": { "stderr": "trurl error: duplicate component for iterate: port\ntrurl error: Try trurl -h for help\n", "returncode": 11, "stdout": "" } }, { "input": { "arguments": [ "https://curl.se", "-s", "host=localhost", "--iterate", "port=22 23" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": "https://localhost:22/\nhttps://localhost:23/\n" } }, { "input": { "arguments": [ "hello@localhost", "--iterate", "host=one two", "-g", "{host} {user}" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": "one hello\ntwo hello\n" } }, { "input": { "arguments": [ "https://example.com?utm=tra%20cker&address%20=home&here=now&thisthen", "--json" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": [ { "url": "https://example.com/?utm=tra+cker&address+=home&here=now&thisthen", "parts": { "scheme": "https", "host": "example.com", "path": "/", "query": "utm=tra cker&address =home&here=now&thisthen" }, "params": [ { "key": "utm", "value": "tra cker" }, { "key": "address ", "value": "home" }, { "key": "here", "value": "now" }, { "key": "thisthen", "value": "" } ] } ] } }, { "input": { "arguments": [ "ftp://smith:secret@example.com:33/path?search=me#where", "--json" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": [ { "url": "ftp://smith:secret@example.com:33/path?search=me#where", "parts": { "scheme": "ftp", "user": "smith", "password": "secret", "host": "example.com", "port": "33", "path": "/path", "query": "search=me", "fragment": "where" }, "params": [ { "key": "search", "value": "me" } ] } ] } }, { "input": { "arguments": [ "example.com", "--json" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": [ { "url": "http://example.com/", "parts": { "scheme": "http", "host": "example.com", "path": "/" } } ] } }, { 
"input": { "arguments": [ "example.com", "other.com", "--json" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": [ { "url": "http://example.com/", "parts": { "scheme": "http", "host": "example.com", "path": "/" } }, { "url": "http://other.com/", "parts": { "scheme": "http", "host": "other.com", "path": "/" } } ] } }, { "input": { "arguments": [ "localhost", "--iterate", "host=one two", "--json" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": [ { "url": "http://one/", "parts": { "scheme": "http", "host": "one", "path": "/" } }, { "url": "http://two/", "parts": { "scheme": "http", "host": "two", "path": "/" } } ] } }, { "input": { "arguments": [ "--json", "-s", "scheme=irc", "-s", "host=curl.se" ] }, "expected": { "stdout": [ { "url": "irc://curl.se/", "parts": { "scheme": "irc", "host": "curl.se", "path": "/" } } ], "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "--json", "-s", "host=curl.se" ] }, "expected": { "stdout": [], "returncode": 7, "stderr": "trurl error: not enough input for a URL\ntrurl error: Try trurl -h for help\n" } }, { "input": { "arguments": [ "--verify", "--json", "ftp://example.org", "", "git://curl.se/" ] }, "expected": { "stdout": [ { "url": "ftp://example.org/", "parts": { "scheme": "ftp", "host": "example.org", "path": "/" } } ], "returncode": 9, "stderr": true } }, { "input": { "arguments": [ "-s", "scheme=imap" ] }, "expected": { "stdout": "", "returncode": 7, "stderr": "trurl error: not enough input for a URL\ntrurl error: Try trurl -h for help\n" } }, { "input": { "arguments": [ "-g", "{query:}", "http://localhost/?=bar" ] }, "expected": { "stdout": "bar\n", "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "--json", "https://curl.se/?&&&" ] }, "expected": { "stdout": [ { "url": "https://curl.se/?&&&", "parts": { "scheme": "https", "host": "curl.se", "path": "/", "query": "&&&" }, "params": [] } ], "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "--json", "--trim", 
"query=f*", "localhost?foo&bar=ar" ] }, "expected": { "stdout": [ { "url": "http://localhost/?bar=ar", "parts": { "scheme": "http", "host": "localhost", "path": "/", "query": "bar=ar" }, "params": [ { "key": "bar", "value": "ar" } ] } ], "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "https://example.com?search=hello&utm_source=tracker&utm_block&testing", "--trim", "query=utm_*", "--json" ] }, "expected": { "stdout": [ { "url": "https://example.com/?search=hello&testing", "parts": { "scheme": "https", "host": "example.com", "path": "/", "query": "search=hello&testing" }, "params": [ { "key": "search", "value": "hello" }, { "key": "testing", "value": "" } ] } ], "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "https://räksmörgås.se", "-g", "{default:puny:url}" ] }, "required": ["punycode"], "encoding": "UTF-8", "expected": { "stderr": "", "returncode": 0, "stdout": "https://xn--rksmrgs-5wao1o.se:443/\n" } }, { "input": { "arguments": [ "https://räksmörgås.se", "-g", "{puny:url}" ] }, "required": ["punycode"], "encoding": "UTF-8", "expected": { "stderr": "", "returncode": 0, "stdout": "https://xn--rksmrgs-5wao1o.se/\n" } }, { "input": { "arguments": [ "https://räksmörgås.se", "-g", "{puny:host}" ] }, "required": ["punycode"], "encoding": "UTF-8", "expected": { "stderr": "", "returncode": 0, "stdout": "xn--rksmrgs-5wao1o.se\n" } }, { "input": { "arguments": [ "imap://user:password;crazy@[ff00::1234%hello]:1234/path?a=b&c=d#fragment", "--json" ] }, "required": ["imap-options"], "expected": { "returncode": 0, "stdout": [ { "url": "imap://user:password;crazy@[ff00::1234%25hello]:1234/path?a=b&c=d#fragment", "parts": { "scheme": "imap", "user": "user", "password": "password", "options": "crazy", "host": "[ff00::1234]", "port": "1234", "path": "/path", "query": "a=b&c=d", "fragment": "fragment", "zoneid": "hello" }, "params": [ { "key": "a", "value": "b" }, { "key": "c", "value": "d" } ] } ] } }, { "input": { "arguments": [ 
"imap://example.com/", "--get", "port: {port}, default:port: {default:port}" ] }, "required": ["imap-options"], "expected": { "returncode": 0, "stdout": "port: , default:port: 143\n" } }, { "input": { "arguments": [ "http://example.com:8080/", "--get", "port: {port}, default:port: {default:port}" ] }, "expected": { "returncode": 0, "stdout": "port: 8080, default:port: 8080\n" } }, { "input": { "arguments": [ "localhost", "-s", "host=foo", "--iterate", "host=bar baz" ] }, "expected": { "stdout": "", "returncode": 11, "stderr": "trurl error: duplicate --iterate and --set for component host\ntrurl error: Try trurl -h for help\n" } }, { "input": { "arguments": [ "emanuele6://curl.se/trurl", "", "https://example.org" ] }, "expected": { "stdout": "emanuele6://curl.se/trurl\nhttps://example.org/\n", "returncode": 0, "stderr": true } }, { "input": { "arguments": [ "--verify", "--no-guess-scheme", "hello" ] }, "expected": { "stdout": "", "returncode": 9, "stderr": "trurl error: Bad scheme [hello]\ntrurl error: Try trurl -h for help\n" } }, { "input": { "arguments": [ "--verify", "-f", "testfiles/test0000.txt" ] }, "expected": { "stdout": "http://example.org/\n", "returncode": 0, "stderr": "trurl note: skipping long line\n" } }, { "input": { "arguments": [ "-f", "testfiles/test0001.txt" ] }, "expected": { "stdout": "https://curl.se/\nhttps://docs.python.org/\ngit://github.com/curl/curl.git\nhttp://example.org/\nxyz://hello/?hi\n", "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "--no-guess-scheme", "foo", "hi", "https://example.org", "hey", "git://curl.se" ] }, "expected": { "stdout": "https://example.org/\ngit://curl.se/\n", "returncode": 0, "stderr": "trurl note: Bad scheme [foo]\ntrurl note: Bad scheme [hi]\ntrurl note: Bad scheme [hey]\n" } }, { "input": { "arguments": [ "-f", "testfiles/test0002.txt", "--json" ] }, "expected": { "stdout": "[]\n", "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "--accept-space", "-s", "query:=x=10&x=2 3", 
"localhost" ] }, "required": ["white-space"], "expected": { "stdout": "http://localhost/?x=10&x=2+3\n", "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "-s", "path=\\\\", "-g", "{path}\\n{:path}", "localhost" ] }, "expected": { "stdout": "/\\\\\n/%5c%5c\n", "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "-s", "path=\\\\", "--json", "localhost" ] }, "expected": { "stdout": [ { "url": "http://localhost/%5c%5c", "parts": { "scheme": "http", "host": "localhost", "path": "/\\\\" } } ], "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "-s", "path=\\\\", "-g", "{path}\\n{:path}", "--urlencode", "localhost" ] }, "expected": { "stdout": "/%5c%5c\n/%5c%5c\n", "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "-s", "path=abc\\\\", "-s", "query:=a&b&a%26b", "--urlencode", "--json", "localhost" ] }, "expected": { "stdout": [ { "url": "http://localhost/abc%5c%5c?a&b&a%26b", "parts": { "scheme": "http", "host": "localhost", "path": "/abc%5c%5c", "query": "a&b&a%26b" }, "params": [ { "key": "a", "value": "" }, { "key": "b", "value": "" }, { "key": "a&b", "value": "" } ] } ], "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "-s", "scheme:=http", "-s", "host:=localhost", "-s", "path:=/ABC%5C%5C", "-s", "query:=a&b&a%26b" ] }, "expected": { "stdout": "http://localhost/ABC%5c%5c?a&b&a%26b\n", "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "-g", "{query:b}\\t{query-all:a}\\n{:query:b}\\t{:query-all:a}", "https://example.org/foo?a=1&b=%23&a=%26#hello" ] }, "expected": { "stdout": "#\t1 &\n%23\t1 %26\n", "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "--urlencode", "-g", "{query:b}\\t{query-all:a}\\n{:query:b}\\t{:query-all:a}", "https://example.org/foo?a=1&b=%23&a=%26#hello" ] }, "expected": { "stdout": "%23\t1 %26\n%23\t1 %26\n", "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "-a", "query=c=moo", "--sort-query", "https://example.org/foo?x=hi#rye" ] }, "expected": { 
"stdout": "https://example.org/foo?c=moo&x=hi#rye\n", "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "--qtrim", "a", "-a", "query=a=ciao", "-a", "query=b=salve", "https://example.org/foo?a=hi&b=hello&x=y" ] }, "expected": { "stdout": "https://example.org/foo?b=hello&x=y&a=ciao&b=salve\n", "returncode": 0, "stderr": "" } }, { "input" : { "arguments": [ "http://example.com/?q=mr%00smith", "--json", "--urlencode" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": [ { "url": "http://example.com/?q=mr%00smith", "parts": { "scheme": "http", "host": "example.com", "path": "/", "query": "q=mr%00smith" }, "params": [ { "key": "q", "value": "mr\u0000smith" } ] } ] } }, { "input" : { "arguments": [ "http://example.com/?q=mr%00sm%00ith", "--json", "--urlencode" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": [ { "url": "http://example.com/?q=mr%00sm%00ith", "parts": { "scheme": "http", "host": "example.com", "path": "/", "query": "q=mr%00sm%00ith" }, "params": [ { "key": "q", "value": "mr\u0000sm\u0000ith" } ] } ] } }, { "input" : { "arguments": [ "http://example.com/?q=mr%00%00%00smith", "--json", "--urlencode" ] }, "expected": { "stderr": "", "returncode": 0, "stdout": [ { "url": "http://example.com/?q=mr%00%00%00smith", "parts": { "scheme": "http", "host": "example.com", "path": "/", "query": "q=mr%00%00%00smith" }, "params": [ { "key": "q", "value": "mr\u0000\u0000\u0000smith" } ] } ] } }, { "input": { "arguments": [ "--url", "https://curl.se/we/are.html?*=moo&user=many#more", "--qtrim", "\\*" ] }, "expected": { "stdout": "https://curl.se/we/are.html?user=many#more\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "http://xn--rksmrgs-5wao1o/", "--as-idn" ] }, "required": ["punycode2idn"], "encoding": "UTF-8", "expected": { "stdout": "http://räksmörgås/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "http://xn--rksmrgs-5wao1o/", "-g", "{idn:host}" ] }, "required": ["punycode2idn"], 
"encoding": "UTF-8", "expected": { "stdout": "räksmörgås\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "http://xn-----/", "--as-idn", "--quiet" ] }, "required": ["punycode2idn"], "encoding": "UTF-8", "expected": { "stdout": "http://xn-----/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "http://xn-----/", "--as-idn" ] }, "required": ["punycode2idn"], "encoding": "UTF-8", "expected": { "stdout": "http://xn-----/\n", "stderr": "trurl note: Error converting url to IDN [Bad hostname]\n", "returncode": 0 } }, { "input": { "arguments": [ "--verify", "-f", "testfiles/test0000.txt", "--quiet" ] }, "expected": { "stdout": "http://example.org/\n", "returncode": 0, "stderr": "" } }, { "input": { "arguments": [ "--curl", "--verify", "foo://bar" ] }, "expected": { "stdout": "", "stderr": true, "returncode": 9 } }, { "input": { "arguments": [ "http://test.org/?key=val", "--replace", "key=foo" ] }, "expected": { "stdout": "http://test.org/?key=foo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "http://test.org/?that=thing&key=val", "--replace", "key=foo" ] }, "expected": { "stdout": "http://test.org/?that=thing&key=foo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "http://test.org/?that=thing&key", "--replace", "key=foo" ] }, "expected": { "stdout": "http://test.org/?that=thing&key=foo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "http://test.org/?that=thing&key=foo", "--replace", "key" ] }, "expected": { "stdout": "http://test.org/?that=thing&key\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com?a=123&b=321&b=987", "--replace", "b=foo" ] }, "expected": { "stdout": "https://example.com/?a=123&b=foo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "example.org/?quest=best", "--replace", "quest=%00", "--json", "--urlencode" ] }, "expected": { "stdout": [{ "url": "http://example.org/?quest=%2500", "parts": { 
"scheme": "http", "host": "example.org", "path": "/", "query": "quest=%2500" }, "params": [ { "key": "quest", "value": "%00" } ] }], "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "example.com", "--replace" ] }, "expected": { "stderr": "trurl error: No data passed to replace component\ntrurl error: Try trurl -h for help\n", "stdout":"", "returncode": 12 } }, { "input": { "arguments": [ "http://test.org/?that=thing", "--force-replace", "key=foo" ] }, "expected": { "stdout": "http://test.org/?that=thing&key=foo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "http://test.org/?that=thing", "--replace-append", "key=foo" ] }, "expected": { "stdout": "http://test.org/?that=thing&key=foo\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "0?00%000000000000000000000=0000000000" ] }, "expected": { "stdout": "http://0.0.0.0/?00%000000000000000000000=0000000000\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--json", "0?0%000000000000000000000000000000000", "--urlencode" ] }, "expected": { "returncode": 0, "stderr": "", "stdout": [ { "url": "http://0.0.0.0/?0%000000000000000000000000000000000", "parts": { "scheme": "http", "host": "0.0.0.0", "path": "/", "query": "0%000000000000000000000000000000000" }, "params": [ { "key": "0\u00000000000000000000000000000000000", "value": "" } ] } ] } }, { "input": { "arguments": [ "--json", "0?0%000000000000000000000000000000000=000%0000000000", "--urlencode" ] }, "expected": { "returncode": 0, "stderr": "", "stdout": [ { "url": "http://0.0.0.0/?0%000000000000000000000000000000000=000%0000000000", "parts": { "scheme": "http", "host": "0.0.0.0", "path": "/", "query": "0%000000000000000000000000000000000=000%0000000000" }, "params": [ { "key": "0\u00000000000000000000000000000000000", "value": "000\u000000000000" } ] } ] } }, { "input": { "arguments": [ "example.com", "--set", "host=[::1]" ] }, "expected": { "stdout": "http://[::1]/\n", "stderr": "", "returncode": 0 } }, 
{ "input": { "arguments": [ "example.com:88", "--set", "port?=99" ] }, "expected": { "stdout": "http://example.com:88/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "example.com", "--set", "port?=99" ] }, "expected": { "stdout": "http://example.com:99/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "example.com", "--append", "query=add", "--iterate", "scheme=http ftp" ] }, "expected": { "stdout": "http://example.com/?add\nftp://example.com/?add\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "example.com", "--append", "path=add", "--iterate", "scheme=http ftp" ] }, "expected": { "stdout": "http://example.com/add\nftp://example.com/add\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "example.com", "--append", "path=add", "--append", "path=two" ] }, "expected": { "stdout": "http://example.com/add/two\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://curl.se?name=mr%00smith", "--get", "{query}" ] }, "expected": { "stderr": "trurl note: URL decode error, most likely because of rubbish in the input (query)\n", "returncode": 0, "stdout": "\n" } }, { "input": { "arguments": [ "https://curl.se?name=mr%00smith", "--get", "{strict:query}" ] }, "expected": { "stderr": "trurl error: problems URL decoding query\ntrurl error: Try trurl -h for help\n", "returncode": 10, "stdout": "" } }, { "required": ["no-guess-scheme"], "input": { "arguments": [ "example.com", "--set", "scheme?=https" ] }, "expected": { "stdout": "https://example.com/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "ftp://example.com", "--set", "scheme?=https" ] }, "expected": { "stdout": "ftp://example.com/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "http://example.org/%18", "--json" ] }, "expected": { "stdout":[ { "url": "http://example.org/%18", "parts": { "scheme": "http", "host": "example.org", "path": "/\u0018" } } ], "stderr": "", "returncode": 0 } }, { 
"input": { "arguments": [ "http://example.org/%18", "--json", "--urlencode" ] }, "expected": { "stdout":[ { "url": "http://example.org/%18", "parts": { "scheme": "http", "host": "example.org", "path": "/%18" } } ], "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com/one/t%61o/%2F%42/" ] }, "expected": { "stdout": "https://example.com/one/tao/%2fB/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com/one/t%61o/%2F%42/", "--append", "path=%61" ] }, "expected": { "stdout": "https://example.com/one/tao/%2fB/%2561\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://ex%61mple.com/h%61s/?wh%61t" ] }, "expected": { "stdout": "https://example.com/has/?what\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com/", "--get", "{must:query}" ] }, "expected": { "stdout": "", "stderr": "trurl error: missing must:query\ntrurl error: Try trurl -h for help\n", "returncode": 10 } }, { "input": { "arguments": [ "https://example.com/?", "--get", "{must:query}" ] }, "required": ["get-empty"], "expected": { "stdout": "\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "https://example.com/", "--get", "{must:fragment}" ] }, "expected": { "stdout": "", "stderr": "trurl error: missing must:fragment\ntrurl error: Try trurl -h for help\n", "returncode": 10 } }, { "input": { "arguments": [ "http://example.org/%18", "--get", "{path}" ] }, "expected": { "stdout": "/\u0018\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "http://example.org/?a=&b=1" ] }, "expected": { "stdout": "http://example.org/?a=&b=1\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "http://example.org/?a=1&b=" ] }, "expected": { "stdout": "http://example.org/?a=1&b=\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "http://example.org/?a=1&b=&c=2" ] }, "expected": { "stdout": "http://example.org/?a=1&b=&c=2\n", "stderr": "", 
"returncode": 0 } }, { "input": { "arguments": [ "http://example.org/?a=1&b=&c=2", "--json" ] }, "expected": { "stdout":[ { "url": "http://example.org/?a=1&b=&c=2", "parts": { "scheme": "http", "host": "example.org", "path": "/", "query": "a=1&b=&c=2" }, "params": [ { "key": "a", "value": "1" }, { "key": "b", "value": "" }, { "key": "c", "value": "2" } ] } ], "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "http://example.org/?=1&b=2&c=&=3" ] }, "expected": { "stdout": "http://example.org/?=1&b=2&c=&=3\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "http://example.com/?a=%5D" ] }, "expected": { "stdout": "http://example.com/?a=%5d\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "http://example.com/?a=%5D&b=%5D" ] }, "expected": { "stdout": "http://example.com/?a=%5d&b=%5d\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "sftp://us%65r:pwd;giraffe@odd" ] }, "expected": { "stdout": "sftp://user:pwd%3bgiraffe@odd/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "imap://us%65r:pwd;gir%41ffe@odd" ] }, "expected": { "stdout": "imap://user:pwd;girAffe@odd/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "sftp://us%65r:pwd;giraffe@odd", "--get", "[password]" ] }, "expected": { "stdout": "pwd;giraffe\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "sftp://us%65r:pwd;giraffe@odd", "--get", "[:password]" ] }, "expected": { "stdout": "pwd%3bgiraffe\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "--url", "http://åäö/", "--punycode", "-s", "port=21" ] }, "required": ["punycode"], "encoding": "UTF-8", "expected": { "stdout": "http://xn--4cab6c:21/\n", "stderr": "", "returncode": 0 } }, { "input": { "arguments": [ "sftp://odd", "--set", "port=144", "--set", "port=145" ] }, "expected": { "stdout": "", "stderr": "trurl error: duplicate --set for component port\ntrurl error: Try trurl -h for help\n", "returncode": 5 } }, { "input": { 
"arguments": [ "sftp://odd", "--get", "[port]", "--get", "{port}" ] }, "expected": { "stdout": "", "stderr": "trurl error: only one --get is supported\ntrurl error: Try trurl -h for help\n", "returncode": 4 } }, { "input": { "arguments": [ "url", "-f", "testfiles/test0000.txt", "-f", "testfiles/test0000.txt" ] }, "expected": { "stdout": "", "returncode": 4, "stderr": "trurl error: only one --url-file is supported\ntrurl error: Try trurl -h for help\n" } }, { "input": { "arguments": [ "--url" ] }, "expected": { "stdout": "", "stderr": "trurl error: Missing argument for --url\ntrurl error: Try trurl -h for help\n", "returncode": 3 } }, { "input": { "arguments": [ "url", "--set" ] }, "expected": { "stdout": "", "stderr": "trurl error: Missing argument for --set\ntrurl error: Try trurl -h for help\n", "returncode": 3 } }, { "input": { "arguments": [ "url", "--redirect" ] }, "expected": { "stdout": "", "stderr": "trurl error: Missing argument for --redirect\ntrurl error: Try trurl -h for help\n", "returncode": 3 } }, { "input": { "arguments": [ "url", "--get" ] }, "expected": { "stdout": "", "stderr": "trurl error: Missing argument for --get\ntrurl error: Try trurl -h for help\n", "returncode": 3 } }, { "input": { "arguments": [ "url", "--replace" ] }, "expected": { "stdout": "", "stderr": "trurl error: No data passed to replace component\ntrurl error: Try trurl -h for help\n", "returncode": 12 } }, { "input": { "arguments": [ "url", "--replace-append" ] }, "expected": { "stdout": "", "stderr": "trurl error: No data passed to replace component\ntrurl error: Try trurl -h for help\n", "returncode": 12 } }, { "input": { "arguments": [ "url", "--append" ] }, "expected": { "stdout": "", "stderr": "trurl error: Missing argument for --append\ntrurl error: Try trurl -h for help\n", "returncode": 3 } }, { "input": { "arguments": [ "url", "--query-separator", "''" ] }, "expected": { "stdout": "", "stderr": "trurl error: only single-letter query separators are supported\ntrurl 
error: Try trurl -h for help\n", "returncode": 4 } }, { "input": { "arguments": [ "url", "--query-separator", "aa" ] }, "expected": { "stdout": "", "stderr": "trurl error: only single-letter query separators are supported\ntrurl error: Try trurl -h for help\n", "returncode": 4 } }, { "input": { "arguments": [ "url", "--json", "--get", "{port}" ] }, "expected": { "stdout": "", "stderr": "trurl error: --get is mutually exclusive with --json\ntrurl error: Try trurl -h for help\n", "returncode": 4 } }, { "input": { "arguments": [ "url", "--get", "{port}", "--json" ] }, "expected": { "stdout": "", "stderr": "trurl error: --json is mutually exclusive with --get\ntrurl error: Try trurl -h for help\n", "returncode": 4 } }, { "input": { "arguments": [ "e?e&&" ] }, "expected": { "stdout": "http://e/?e\n", "returncode": 0 } }, { "input": { "arguments": [ "e?e&" ] }, "expected": { "stdout": "http://e/?e\n", "returncode": 0 } }, { "input": { "arguments": [ "e?e&&&&&&&&&&&&&&&&&&&&&" ] }, "expected": { "stdout": "http://e/?e\n", "returncode": 0 } }, { "input": { "arguments": [ "e?e&&&&&&&&&&a&&&&&&&&&&&" ] }, "expected": { "stdout": "http://e/?e&a\n", "returncode": 0 } } ] trurl-0.16.1/trurl.10000664000000000000000000006053115010312005011121 0ustar00.\" generated by cd2nroff 0.1 from trurl.md .TH trurl 1 "2025-05-12" trurl 0.16.1 .SH NAME trurl \- transpose URLs .SH SYNOPSIS \fBtrurl [options / URLs]\fP .SH DESCRIPTION \fBtrurl\fP parses, manipulates and outputs URLs and parts of URLs. It uses the RFC 3986 definition of URLs and it uses libcurl\(aqs URL parser to do so, which includes a few "extensions". The URL support is limited to \&"hierarchical" URLs, the ones that use \fI://\fP separators after the scheme. Typically you pass in one or more URLs and decide what of that you want output. Possibly modifying the URL as well. trurl knows URLs and every URL consists of up to ten separate and independent \fIcomponents\fP. 
These components can be extracted, removed and updated with trurl and they are referred to by their respective names: scheme, user, password, options, host, port, path, query, fragment and zoneid. .SH NORMALIZATION When provided a URL to work with, trurl "normalizes" it. It means that individual URL components are URL decoded then URL encoded back again and set in the URL. Example: .nf $ trurl 'http://ex%61mple:80/%62ath/a/../b?%2e%FF#tes%74' http://example/bath/b?.%ff#test .fi .SH OPTIONS Options start with one or two dashes. Many of the options require an additional value next to them. Any other argument is interpreted as a URL argument, and is treated as if it was following a \fI\--url\fP option. The first argument that is exactly two dashes (\fI\--\fP), marks the end of options; any argument after the end of options is interpreted as a URL argument even if it starts with a dash. Long options can be provided either as \fI\--flag argument\fP or as \fI\--flag=argument\fP. .IP "-a, --append [component]=[data]" Append data to a component. This can only append data to the path and the query components. For path, this URL encodes and appends the new segment to the path, separated with a slash. For query, this URL encodes and appends the new segment to the query, separated with an ampersand (&). If the appended segment contains an equal sign (\fI=\fP) that one is kept verbatim and both sides of the first occurrence are URL encoded separately. .IP --accept-space When set, trurl tries to accept spaces as part of the URL and instead URL encode such occurrences accordingly. According to RFC 3986, a space cannot legally be part of a URL. This option provides a best\-effort to convert the provided string into a valid URL. .IP --as-idn Converts a punycode ASCII hostname to its original International Domain Name in Unicode. If the hostname is not using punycode then the original hostname is used. .IP --curl Only accept URL schemes supported by libcurl. 
.IP --default-port When set, trurl uses the scheme\(aqs default port number for URLs with a known scheme, and without an explicit port number. Note that trurl only knows default port numbers for URL schemes that are supported by libcurl. Since, by default, trurl removes default port numbers from URLs with a known scheme, this option has practically no effect unless one of \fI\--get\fP, \fI\--json\fP, or \fI\--keep\-port\fP is also specified. .IP "-f, --url-file [filename]" Read URLs to work on from the given file. Use the filename \fI\-\fP (a single minus) to tell trurl to read the URLs from stdin. Each line needs to be a single valid URL. trurl removes one carriage return character at the end of the line if present, trims off all the trailing space and tab characters, and skips all empty (after trimming) lines. The maximum line length supported in a file like this is 4094 bytes. Lines that exceed that length are skipped, and a warning is printed to stderr when they are encountered. .IP "-g, --get [format]" Output text and URL data according to the provided format string. Components from the URL can be output when specified as \fB{component}\fP or \fB[component]\fP, with the name of the part shown within curly braces or brackets. You cannot mix braces and brackets for this purpose in the same command line. The following component names are available (case sensitive): url, scheme, user, password, options, host, port, path, query, fragment and zoneid. \fB{component}\fP expands to nothing if the given component does not have a value. Components are shown URL decoded by default. A URL decoded component may contain data that cannot be displayed properly. Such problems cause a warning to be displayed unless \fB\--quiet\fP is used. trurl supports a range of different qualifiers, or prefixes, to the component that change how trurl handles it: If \fBurl:\fP is specified, like \fI{url:path}\fP, the component gets output URL encoded. 
As a shortcut, \fIurl:\fP also works written as a single colon: \fI{:path}\fP. If \fBstrict:\fP is specified, like \fI{strict:path}\fP, URL decode problems are turned into errors. In this stricter mode, a URL decode problem makes trurl stop what it is doing and return with exit code 10. If \fBmust:\fP is specified, like \fI{must:query}\fP, it makes trurl return an error if the requested component does not exist in the URL. By default a missing component will just be shown blank. If \fBdefault:\fP is specified, like \fI{default:url}\fP or \fI{default:port}\fP, and the port is not explicitly specified in the URL, the scheme\(aqs default port is output if it is known. If \fBpuny:\fP is specified, like \fI{puny:url}\fP or \fI{puny:host}\fP, the punycoded version of the hostname is used in the output. This option is mutually exclusive with \fBidn:\fP. If \fBidn:\fP is specified like \fI{idn:url}\fP or \fI{idn:host}\fP, the International Domain Name version of the hostname is used in the output if it is provided as a correctly encoded punycode version. This option is mutually exclusive with \fBpuny:\fP. If \fI\--default\-port\fP is specified, all formats are expanded as if they used \fIdefault:\fP; and if \fI\--punycode\fP is used, all formats are expanded as if they used \fIpuny:\fP. Also note that \fI{url}\fP is affected by the \fI\--keep\-port\fP option. Hosts provided as IPv6 numerical addresses are provided within square brackets. Like \fI[fe80::20c:29ff:fe9c:409b]\fP. Hosts provided as IPv4 numerical addresses are \fInormalized\fP and provided as four dot\-separated decimal numbers when output. You can access specific keys in the query string using the format \fB{query:key}\fP. Then the value of the first matching key is output using a case sensitive match. When extracting a URL decoded query key that contains \fI%00\fP, such octet is replaced with a single period \fI.\fP in the output. 
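As a rough illustration of the \fB{query:key}\fP lookup described above, the following is a minimal Python sketch, not trurl\(aqs actual C implementation. The helper name query_value is hypothetical. The sketch assumes that pairs are separated by \fI&\fP, that the name is compared after URL decoding, that a missing key yields an empty string, and it leaves out any special handling of \fI+\fP.

```python
from urllib.parse import urlsplit, unquote


def query_value(url, key, sep="&"):
    """Hypothetical helper: return the URL decoded value of the first
    query pair whose name equals key (case sensitive), or "" if absent."""
    query = urlsplit(url).query
    for pair in query.split(sep):
        name, _, value = pair.partition("=")
        if unquote(name) == key:  # case-sensitive comparison (assumption: decoded name)
            # Mirror the documented rule: a decoded %00 octet is shown as "."
            return unquote(value).replace(chr(0), ".")
    return ""


print(query_value("https://example.org/foo?a=1&b=%23&a=%26#hello", "b"))
```

Run as written, this prints a single \fI#\fP, the decoded value of the first \fIb\fP pair. Only the first matching pair is consulted, so asking for \fIa\fP in the same URL gives \fI1\fP.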
You can access specific keys in the query string and output all values using the format \fB{query\-all:key}\fP. This looks for \fIkey\fP case sensitively and outputs all values for that key space\-separated. The \fIformat\fP string supports the following backslash sequences: \\ \- backslash \\t \- tab \\n \- newline \\r \- carriage return \\{ \- an open curly brace that does not start a variable \\[ \- an open bracket that does not start a variable All other text in the format string is shown as\-is. .IP "-h, --help" Show the help output. .IP "--iterate [component]=[item1 item2 ...]" Set the component to multiple values and output the result once for each iteration. Several combined iterations are allowed to generate combinations, but only one \fI\--iterate\fP option per component. The listed items to iterate over should be separated by single spaces. Example: .nf $ trurl example.com --iterate=scheme="ftp https" --iterate=port="22 80" ftp://example.com:22/ ftp://example.com:80/ https://example.com:22/ https://example.com:80/ .fi .IP --json Outputs all set components of the URLs as JSON objects. All components of the URL that have data get populated in the parts object using their component names. See below for details on the format. The URL components are provided URL decoded. Change that with \fB\--urlencode\fP. .IP --keep-port By default, trurl removes default port numbers from URLs with a known scheme even if they are explicitly specified in the input URL. This option makes trurl not remove them. Example: .nf $ trurl https://example.com:443/ --keep-port https://example.com:443/ .fi .IP --no-guess-scheme Disables libcurl\(aqs scheme guessing feature. URLs that do not contain a scheme are treated as invalid URLs. Example: .nf $ trurl example.com --no-guess-scheme trurl note: Bad scheme [example.com] .fi .IP --punycode Uses the punycode version of the hostname, which is how International Domain Names are converted into plain ASCII. 
If the hostname is not using IDN, the regular ASCII name is used. Example: .nf $ trurl http://åäö/ --punycode http://xn--4cab6c/ .fi .IP "--qtrim [what]" Trims data off a query. \fIwhat\fP is specified as a full name of a name/value pair, or as a word prefix (using a single trailing asterisk (\fI*\fP)) which makes trurl remove the tuples from the query string that match the instruction. To match a literal trailing asterisk instead of using a wildcard, escape it with a backslash in front of it. Like \fI\\*\fP. .IP "--query-separator [what]" Specify the single letter used for separating query pairs. The default is \fI&\fP but at least in the past sometimes semicolons \fI;\fP or even colons \fI:\fP have been used for this purpose. If your URL uses something other than the default letter, setting the right one makes sure trurl can do its query operations properly. Example: .nf $ trurl "https://curl.se?b=name:a=age" --sort-query --query-separator ":" https://curl.se/?a=age:b=name .fi .IP --quiet Suppress (some) notes and warnings. .IP "--redirect [URL]" Redirect the URL to this new location. The redirection is performed on the base URL, so, if no base URL is specified, no redirection is performed. Example: .nf $ trurl --url https://curl.se/we/are.html --redirect ../here.html https://curl.se/here.html .fi .IP "--replace [data]" Replaces a URL query. data can either take the form of a single value, or as a key/value pair in the shape \fIfoo=bar\fP. If replace is called on an item that is not in the list of queries trurl ignores that item. trurl URL encodes both sides of the \fI=\fP character in the given input data argument. .IP "--replace-append [data]" Works the same as \fI\--replace\fP, but trurl appends a missing query string if it is not in the query list already. .IP "-s, --set [component][:]=[data]" Set this URL component. Setting blank string (\fI""\fP) clears the component from the URL. 
The following components can be set: url, scheme, user, password, options, host, port, path, query, fragment and zoneid. If a simple \fI=\fP\-assignment is used, the data is URL encoded when applied. If \fI:=\fP is used, the data is assumed to already be URL encoded and stored as\-is. If \fI?=\fP is used, the set is only performed if the component is not already set. It avoids overwriting any already set data. You can also combine \fI:\fP and \fI?\fP into \fI?:=\fP if desired. If no URL or \fI\--url\-file\fP argument is provided, trurl tries to create a URL using the components provided by the \fI\--set\fP options. If not enough components are specified, this fails. .IP --sort-query The "variable=content" tuples in the query component are sorted in a case insensitive alphabetical order. This helps make URLs identical when they otherwise differ only in the order of their query pairs. .IP "--trim [component]=[what]" Deprecated: use \fB\--qtrim\fP. Trims data off a component. Currently this can only trim a query component. \fIwhat\fP is specified as a full word or as a word prefix (using a single trailing asterisk (\fI*\fP)) which makes trurl remove the tuples from the query string that match the instruction. To match a literal trailing asterisk instead of using a wildcard, escape it with a backslash in front of it. Like \fI\\*\fP. .IP "--url [URL]" Set the input URL to work with. The URL may be provided without a scheme, which then typically is not actually a legal URL but trurl tries to figure out what is meant and guess what scheme to use (unless \fI\--no\-guess\-scheme\fP is used). Providing multiple URLs makes trurl act on all URLs in a serial fashion. If the URL cannot be parsed for whatever reason, trurl simply moves on to the next provided URL \- unless \fI\--verify\fP is used. .IP --urlencode Outputs URL encoded version of components by default when using \fI\--get\fP or \fI\--json\fP. .IP "-v, --version" Show version information and exit. 
.IP --verify When a URL is provided, return an error immediately if it does not parse as a valid URL. In normal cases, trurl can forgive a bad URL input. .SH URL COMPONENTS .IP scheme This is the leading character sequence of a URL, excluding the "://" separator. It cannot be specified URL encoded. A URL cannot exist without a scheme, but unless \fB\--no\-guess\-scheme\fP is used trurl guesses which scheme was intended if none was provided. Examples: .nf $ trurl https://odd/ -g '{scheme}' https .fi .nf $ trurl odd -g '{scheme}' http .fi .nf $ trurl odd -g '{scheme}' --no-guess-scheme trurl note: Bad scheme [odd] .fi .IP user After the scheme separator, there can be a username provided. If it ends with a colon (\fI:\fP), there is a password provided. If it ends with an at character (\fI@\fP) there is no password provided in the URL. Example: .nf $ trurl https://user%3a%40:secret@odd/ -g '{user}' user:@ .fi .IP password If the password ends with a semicolon (\fI;\fP) there is an options field following. This field is only accepted by trurl for URLs using the IMAP scheme. Example: .nf $ trurl https://user:secr%65t@odd/ -g '{password}' secret .fi .IP options This field can only end with an at character (\fI@\fP) that separates the options from the hostname. .nf $ trurl 'imap://user:pwd;giraffe@odd' -g '{options}' giraffe .fi If the scheme is not IMAP, the \fIgiraffe\fP part is instead considered part of the password: .nf $ trurl 'sftp://user:pwd;giraffe@odd' -g '{password}' pwd;giraffe .fi We strongly advise users to %\-encode \fI;\fP, \fI:\fP and \fI@\fP in URLs to reduce the risk of confusion. .IP host The host component is the hostname or a numerical IP address. If a hostname is provided, it can be an International Domain Name with non\-ASCII characters. A hostname can be provided URL encoded. trurl provides options for working with the IDN hostnames either as IDN or in its punycode version. 
Example, convert an IDN name to punycode in the output: .nf $ trurl http://åäö/ --punycode http://xn--4cab6c/ .fi Or the reverse, convert a punycode hostname into its IDN version: .nf $ trurl http://xn--4cab6c/ --as-idn http://åäö/ .fi If the URL\(aqs hostname starts with an open bracket (\fI[\fP) it is a numerical IPv6 address that also must end with a closing bracket (\fI]\fP). trurl normalizes IPv6 addresses. Example: .nf $ trurl 'http://[2001:9b1:0:0:0:0:7b97:364b]/' http://[2001:9b1::7b97:364b]/ .fi A numerical IPv4 address can be specified using one, two, three or four numbers separated with dots and they can use decimal, octal or hexadecimal. trurl normalizes provided addresses and uses four dotted decimal numbers in its output. Examples: .nf $ trurl http://646464646/ http://38.136.68.134/ .fi .nf $ trurl http://246.646/ http://246.0.2.134/ .fi .nf $ trurl http://246.46.646/ http://246.46.2.134/ .fi .nf $ trurl http://0x14.0xb3022/ http://20.11.48.34/ .fi .IP zoneid If the provided host is an IPv6 address, it might contain a specific zoneid. It is normally a number or a network interface name. Example: .nf $ trurl 'http://[2001:9b1::f358:1ba4:7b97:364b%enp3s0]/' -g '{zoneid}' enp3s0 .fi .IP port If the host ends with a colon (\fI:\fP) then a port number follows. It is a 16\-bit decimal number that may not be URL encoded. trurl knows the default port number for many URL schemes so it can show port numbers for a URL even if none was explicitly used in the URL. With \fB\--default\-port\fP it can add the default port to a URL even when not provided. Example: .nf $ trurl http:/a --default-port http://a:80/ .fi Similarly, trurl normally hides the port number if the given number is the default. Example: .nf $ trurl http:/a:80 http://a/ .fi But a user can make trurl keep the port even if it is the default, with \fB\--keep\-port\fP. 
Example: .nf $ trurl http:/a:80 --keep-port http://a:80/ .fi .IP path A URL path is assumed to always start with and contain at least a slash (\fI/\fP), even if none is actually provided in the URL. Example: .nf $ trurl http://xn--4cab6c -g '[path]' / .fi When setting the path, trurl will inject a leading slash if none is provided: .nf $ trurl http://hello -s path="pony" http://hello/pony .fi .nf $ trurl http://hello -s path="/pony" http://hello/pony .fi If the input path contains dotdot or dot\-slash sequences, they are normalized away. Example: .nf $ trurl http://hej/one/../two/../three/./four http://hej/three/four .fi You can append a new segment to an existing path with \fB\--append\fP like this: .nf $ trurl http://twelve/three?hello --append path=four http://twelve/three/four?hello .fi .IP query The query part does not include the leading question mark (\fI?\fP) separator when extracted with trurl. Example: .nf $ trurl http://horse?elephant -g '{query}' elephant .fi Example, if you set the query with a leading question mark: .nf $ trurl http://horse?elephant -s "query=?elephant" http://horse/?%3felephant .fi Query parts are often made up of a series of name=value pairs separated with ampersands (\fI&\fP), and trurl offers several ways to work with such. 
Append a new name/value pair to a URL with \fB\--append\fP: .nf $ trurl http://host?name=hello --append query=search=life http://host/?name=hello&search=life .fi You can \fB\--replace\fP the value of a specific existing name among the pairs: .nf $ trurl 'http://alpha?one=real&two=fake' --replace two=alsoreal http://alpha/?one=real&two=alsoreal .fi If the name you want to replace might not exist in the URL, you can opt to replace \fIor\fP append the pair: .nf $ trurl 'http://alpha?one=real&two=fake' --replace-append three=alsoreal http://alpha/?one=real&two=fake&three=alsoreal .fi To compare two URLs by their query name/value pairs, sorting the pairs first makes the comparison more likely to succeed: .nf $ trurl "http://alpha/?one=real&two=fake&three=alsoreal" --sort-query http://alpha/?one=real&three=alsoreal&two=fake .fi Remove name/value pairs from the URL by specifying an exact name or a wildcard pattern with \fB\--qtrim\fP: .nf $ trurl 'https://example.com?a12=hej&a23=moo&b12=foo' --qtrim 'a*' https://example.com/?b12=foo .fi .IP fragment The fragment part does not include the leading hash sign (\fI#\fP) separator when extracted with trurl. Example: .nf $ trurl http://horse#elephant -g '{fragment}' elephant .fi Example, if you set the fragment with a leading hash sign: .nf $ trurl "http://horse#elephant" -s "fragment=#zebra" http://horse/#%23zebra .fi The fragment part of a URL is for local purposes only. The data in there is never actually sent over the network when a URL is used for transfers. .IP url trurl supports \fBurl\fP as a named component for \fB\--get\fP to allow for more powerful outputs, but of course it is not actually a "component"; it is the full URL. Example: .nf $ trurl ftps://example.com:2021/p%61th -g '{url}' ftps://example.com:2021/path .fi .SH JSON output format The \fI\--json\fP option outputs a JSON array with one or more objects. One for each URL. 

Each URL JSON object contains a number of properties, a series of key/value
pairs. The exact set present depends on the given URL.
.IP url
This key exists in every object. It is the complete URL. Affected by
\fI\--default\-port\fP, \fI\--keep\-port\fP, and \fI\--punycode\fP.
.IP parts
This key exists in every object, and contains an object with a key for each of
the settable URL components. If a component is missing, it means it is not
present in the URL. The parts are URL decoded unless \fI\--urlencode\fP is
used.
.IP parts.scheme
The URL scheme.
.IP parts.user
The username.
.IP parts.password
The password.
.IP parts.options
The options. Note that only a few URL schemes support the "options" component.
.IP parts.host
The normalized hostname. It might be a UTF\-8 name if an IDN name was used. It
can also be a normalized IPv4 or IPv6 address. An IPv6 address always starts
with a bracket (\fB[\fP) \- and no other hostnames can contain such a symbol.
If \fI\--punycode\fP is used, the punycode version of the host is output
instead.
.IP parts.port
The provided port number as a string. If the port number was not provided in
the URL, but the scheme is a known one, and \fI\--default\-port\fP is in use,
the default port for that scheme is provided here.
.IP parts.path
The path, including the leading slash.
.IP parts.query
The full query, excluding the question mark separator.
.IP parts.fragment
The fragment, excluding the pound sign separator.
.IP parts.zoneid
The zone id, which can only be present in an IPv6 address. When this key is
present, then \fBhost\fP is an IPv6 numerical address.
.IP params
This key contains an array of query key/value objects. Each such pair is
listed with "key" and "value" and their respective contents in the output.

The key/values are extracted from the query, where they are separated by
ampersands (\fB&\fP) or by the separator the user sets with
\fB\--query\-separator\fP.

The query pairs are listed in order of appearance, left to right, but can be
made alpha\-sorted with \fB\--sort\-query\fP.

This key is only present if the URL has a query.
.SH EXAMPLES
.IP "Replace the hostname of a URL"
.nf
$ trurl --url https://curl.se --set host=example.com
https://example.com/
.fi
.IP "Create a URL by setting components"
.nf
$ trurl --set host=example.com --set scheme=ftp
ftp://example.com/
.fi
.IP "Redirect a URL"
.nf
$ trurl --url https://curl.se/we/are.html --redirect here.html
https://curl.se/we/here.html
.fi
.IP "Change port number"
This also shows how trurl removes dot\-dot sequences.
.nf
$ trurl --url https://curl.se/we/../are.html --set port=8080
https://curl.se:8080/are.html
.fi
.IP "Extract the path from a URL"
.nf
$ trurl --url https://curl.se/we/are.html --get '{path}'
/we/are.html
.fi
.IP "Extract the port from a URL"
This gets the default port based on the scheme if the port is not set in the
URL.
.nf
$ trurl --url https://curl.se/we/are.html --get '{default:port}'
443
.fi
.IP "Append a path segment to a URL"
.nf
$ trurl --url https://curl.se/hello --append path=you
https://curl.se/hello/you
.fi
.IP "Append a query segment to a URL"
.nf
$ trurl --url "https://curl.se?name=hello" --append query=search=string
https://curl.se/?name=hello&search=string
.fi
.IP "Read URLs from stdin"
.nf
$ cat urllist.txt | trurl --url-file -
\\&...
.fi
.IP "Output JSON"
.nf
$ trurl "https://fake.host/search?q=answers&user=me#frag" --json
[
  {
    "url": "https://fake.host/search?q=answers&user=me#frag",
    "parts": {
      "scheme": "https",
      "host": "fake.host",
      "path": "/search",
      "query": "q=answers&user=me",
      "fragment": "frag"
    },
    "params": [
      {
        "key": "q",
        "value": "answers"
      },
      {
        "key": "user",
        "value": "me"
      }
    ]
  }
]
.fi
.IP "Remove tracking tuples from query"
.nf
$ trurl "https://curl.se?search=hey&utm_source=tracker" --qtrim "utm_*"
https://curl.se/?search=hey
.fi
.IP "Show a specific query key value"
.nf
$ trurl "https://example.com?a=home&here=now&thisthen" -g '{query:a}'
home
.fi
.IP "Sort the key/value pairs in the query component"
.nf
$ trurl "https://example.com?b=a&c=b&a=c" --sort-query
https://example.com/?a=c&b=a&c=b
.fi
.IP "Work with a query that uses a semicolon separator"
.nf
$ trurl "https://curl.se?search=fool;page=5" --qtrim "search" --query-separator ";"
https://curl.se/?page=5
.fi
.IP "Accept spaces in the URL path"
.nf
$ trurl "https://curl.se/this has space/index.html" --accept-space
https://curl.se/this%20has%20space/index.html
.fi
.IP "Create multiple variations of a URL with different schemes"
.nf
$ trurl "https://curl.se/path/index.html" --iterate "scheme=http ftp sftp"
http://curl.se/path/index.html
ftp://curl.se/path/index.html
sftp://curl.se/path/index.html
.fi
.SH EXIT CODES
trurl returns a non\-zero exit code to indicate problems.
.IP 1
A problem with \--url\-file
.IP 2
A problem with \--append
.IP 3
A command line option misses an argument
.IP 4
A command line option mistake or an illegal option combination.
.IP 5
A problem with \--set
.IP 6
Out of memory
.IP 7
Could not output a valid URL
.IP 8
A problem with \--qtrim
.IP 9
If \--verify is set and the input URL cannot be parsed.
.IP 10
A problem with \--get
.IP 11
A problem with \--iterate
.IP 12
A problem with \--replace or \--replace\-append
.SH WWW
https://curl.se/trurl
.SH SEE ALSO
.BR curl (1),
.BR wcurl (1)
trurl-0.16.1/trurl.c0000664000000000000000000016033115010312005011202 0ustar00
/***************************************************************************
 *                              _                   _
 * Project                     | |_ _ __ _   _ _ __| |
 *                             | __| '__| | | | '__| |
 *                             | |_| |  | |_| | |  | |
 *                              \__|_|   \__,_|_|  |_|
 *
 * Copyright (C) Daniel Stenberg, <daniel@haxx.se>, et al.
 *
 * This software is licensed as described in the file COPYING, which
 * you should have received as part of this distribution. The terms
 * are also available at https://curl.se/docs/copyright.html.
 *
 * You may opt to use, copy, modify, merge, publish, distribute and/or sell
 * copies of the Software, and permit persons to whom the Software is
 * furnished to do so, under the terms of the COPYING file.
 *
 * This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY
 * KIND, either express or implied.
 *
 * SPDX-License-Identifier: curl
 *
 ***************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <curl/curl.h>
#include <curl/mprintf.h>
#include <stdint.h>

#if defined(_MSC_VER) && (_MSC_VER < 1800)
typedef enum {
  bool_false = 0,
  bool_true = 1
} bool;
#define false bool_false
#define true bool_true
#else
#include <stdbool.h>
#endif

#include <locale.h> /* for setlocale() */

#include "version.h"

#ifdef _MSC_VER
#define strdup _strdup
#endif

#if CURL_AT_LEAST_VERSION(7,77,0)
#define SUPPORTS_NORM_IPV4
#endif
#if CURL_AT_LEAST_VERSION(7,81,0)
#define SUPPORTS_ZONEID
#endif
#if CURL_AT_LEAST_VERSION(7,80,0)
#define SUPPORTS_URL_STRERROR
#endif
#if CURL_AT_LEAST_VERSION(7,78,0)
#define SUPPORTS_ALLOW_SPACE
#else
#define CURLU_ALLOW_SPACE 0
#endif
#if CURL_AT_LEAST_VERSION(7,88,0)
#define SUPPORTS_PUNYCODE
#endif
#if CURL_AT_LEAST_VERSION(8,3,0)
#define SUPPORTS_PUNY2IDN
#endif
#if CURL_AT_LEAST_VERSION(7,30,0)
#define SUPPORTS_IMAP_OPTIONS
#endif
#if CURL_AT_LEAST_VERSION(8,9,0)
#define SUPPORTS_NO_GUESS_SCHEME
#else
#define CURLU_NO_GUESS_SCHEME 0
#endif
#if CURL_AT_LEAST_VERSION(8,8,0)
#define SUPPORTS_GET_EMPTY
#else
#define CURLU_GET_EMPTY 0
#endif

#define OUTPUT_URL 0 /* default */
#define OUTPUT_SCHEME 1
#define OUTPUT_USER 2
#define OUTPUT_PASSWORD 3
#define OUTPUT_OPTIONS 4
#define OUTPUT_HOST 5
#define OUTPUT_PORT 6
#define OUTPUT_PATH 7
#define OUTPUT_QUERY 8
#define OUTPUT_FRAGMENT 9
#define OUTPUT_ZONEID 10

#define NUM_COMPONENTS 10 /* excluding "url" */

#define PROGNAME "trurl"

#define REPLACE_NULL_BYTE '.'
/* for query:key extractions */

enum {
  VARMODIFIER_URLENCODED = 1 << 1,
  VARMODIFIER_DEFAULT = 1 << 2,
  VARMODIFIER_PUNY = 1 << 3,
  VARMODIFIER_PUNY2IDN = 1 << 4,
  VARMODIFIER_EMPTY = 1 << 8,
};

struct var {
  const char *name;
  CURLUPart part;
};

struct string {
  char *str;
  size_t len;
};

static const struct var variables[] = {
  {"scheme", CURLUPART_SCHEME},
  {"user", CURLUPART_USER},
  {"password", CURLUPART_PASSWORD},
  {"options", CURLUPART_OPTIONS},
  {"host", CURLUPART_HOST},
  {"port", CURLUPART_PORT},
  {"path", CURLUPART_PATH},
  {"query", CURLUPART_QUERY},
  {"fragment", CURLUPART_FRAGMENT},
  {"zoneid", CURLUPART_ZONEID},
  {NULL, 0}
};

#define ERROR_PREFIX PROGNAME " error: "
#define WARN_PREFIX PROGNAME " note: "

/* error codes */
#define ERROR_FILE 1
#define ERROR_APPEND 2 /* --append mistake */
#define ERROR_ARG 3 /* a command line option misses its argument */
#define ERROR_FLAG 4 /* a command line flag mistake */
#define ERROR_SET 5 /* a --set problem */
#define ERROR_MEM 6 /* out of memory */
#define ERROR_URL 7 /* could not get a URL out of the set components */
#define ERROR_TRIM 8 /* a --qtrim problem */
#define ERROR_BADURL 9 /* if --verify is set and the URL cannot parse */
#define ERROR_GET 10 /* bad --get syntax */
#define ERROR_ITER 11 /* bad --iterate syntax */
#define ERROR_REPL 12 /* a --replace problem */

#ifndef SUPPORTS_URL_STRERROR
/* provide a fake local mockup */
static char *curl_url_strerror(CURLUcode error)
{
  static char buffer[128];
  /* the argument is cast to int, so print it with %d */
  curl_msnprintf(buffer, sizeof(buffer), "URL error %d", (int)error);
  return buffer;
}
#endif

/* Mapping table to go from lowercase to uppercase for plain ASCII.*/
static const unsigned char touppermap[256] = {
  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
  16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
  32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
  48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
  64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
  80, 81, 82,
  83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
  96, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
  80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 123, 124, 125, 126, 127,
  128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143,
  144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159,
  160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175,
  176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191,
  192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207,
  208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223,
  224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239,
  240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255
};

/* Portable, ASCII-consistent toupper. Do not use toupper() because its
   behavior is altered by the current locale. The argument is cast to
   unsigned char so that a negative char value (a byte above 127 where char
   is signed) cannot index outside the 256-entry table. */
#define raw_toupper(in) touppermap[(unsigned char)(in)]

/*
 * casecompare() does ASCII based case insensitive checks, as a strncasecmp
 * replacement.
 */
static int casecompare(const char *first, const char *second, size_t max)
{
  while(*first && *second && max) {
    int diff = raw_toupper(*first) - raw_toupper(*second);
    if(diff)
      /* get out of the loop as soon as they don't match */
      return diff;
    max--;
    first++;
    second++;
  }
  if(!max)
    return 0; /* identical to this point */

  return raw_toupper(*first) - raw_toupper(*second);
}

static void message_low(const char *prefix, const char *suffix,
                        const char *fmt, va_list ap)
{
  fputs(prefix, stderr);
  vfprintf(stderr, fmt, ap);
  fputs(suffix, stderr);
}

static void warnf_low(const char *fmt, va_list ap)
{
  message_low(WARN_PREFIX, "\n", fmt, ap);
}

static void warnf(const char *fmt, ...)
{
  va_list ap;
  va_start(ap, fmt);
  warnf_low(fmt, ap);
  va_end(ap);
}

static void help(void)
{
  int i;
  fputs(
    "Usage: " PROGNAME " [options] [URL]\n"
    " -a, --append [component]=[data] - append data to component\n"
    " --accept-space - give in to this URL abuse\n"
    " --as-idn - encode hostnames in idn\n"
    " --curl - only schemes supported by libcurl\n"
    " --default-port - add known default ports\n"
    " -f, --url-file [file/-] - read URLs from file or stdin\n"
    " -g, --get [{component}s] - output component(s)\n"
    " -h, --help - this help\n"
    " --iterate [component]=[list] - create multiple URL outputs\n"
    " --json - output URL as JSON\n"
    " --keep-port - keep known default ports\n"
    " --no-guess-scheme - require scheme in URLs\n"
    " --punycode - encode hostnames in punycode\n"
    " --qtrim [what] - trim the query\n"
    " --query-separator [letter] - if something else than '&'\n"
    " --quiet - Suppress (some) notes and comments\n"
    " --redirect [URL] - redirect to this\n"
    " --replace [data] - replaces a query [data]\n"
    " --replace-append [data] - appends a new query if not found\n"
    " -s, --set [component]=[data] - set component content\n"
    " --sort-query - alpha-sort the query pairs\n"
    " --url [URL] - URL to work with\n"
    " --urlencode - show components URL encoded\n"
    " -v, --version - show version\n"
    " --verify - return error on (first) bad URL\n"
    " URL COMPONENTS:\n"
    " ", stdout);
  fputs("url, ", stdout);
  for(i = 0; i < NUM_COMPONENTS; i++) {
    printf("%s%s", i ? ", " : "", variables[i].name);
  }
  fputs("\n", stdout);
  exit(0);
}

static void show_version(void)
{
  curl_version_info_data *data = curl_version_info(CURLVERSION_NOW);
  /* puny code isn't guaranteed based on the version, so it must be polled
   * from libcurl */
#if defined(SUPPORTS_PUNYCODE) || defined(SUPPORTS_PUNY2IDN)
  bool supports_puny = (data->features & CURL_VERSION_IDN) != 0;
#endif
#if defined(SUPPORTS_IMAP_OPTIONS)
  bool supports_imap = false;
  const char *const *protocol_name = data->protocols;
  while(*protocol_name && !supports_imap) {
    /* compare all four letters of "imap"; this also matches "imaps" */
    supports_imap = !strncmp(*protocol_name, "imap", 4);
    protocol_name++;
  }
#endif

  fprintf(stdout, "%s version %s libcurl/%s [built-with %s]\n",
          PROGNAME, TRURL_VERSION_TXT, data->version, LIBCURL_VERSION);
  fprintf(stdout, "features:");
#ifdef SUPPORTS_GET_EMPTY
  fprintf(stdout, " get-empty");
#endif
#ifdef SUPPORTS_IMAP_OPTIONS
  if(supports_imap)
    fprintf(stdout, " imap-options");
#endif
#ifdef SUPPORTS_NO_GUESS_SCHEME
  fprintf(stdout, " no-guess-scheme");
#endif
#ifdef SUPPORTS_NORM_IPV4
  fprintf(stdout, " normalize-ipv4");
#endif
#ifdef SUPPORTS_PUNYCODE
  if(supports_puny)
    fprintf(stdout, " punycode");
#endif
#ifdef SUPPORTS_PUNY2IDN
  if(supports_puny)
    fprintf(stdout, " punycode2idn");
#endif
#ifdef SUPPORTS_URL_STRERROR
  fprintf(stdout, " url-strerror");
#endif
#ifdef SUPPORTS_ALLOW_SPACE
  fprintf(stdout, " white-space");
#endif
#ifdef SUPPORTS_ZONEID
  fprintf(stdout, " zone-id");
#endif

  fprintf(stdout, "\n");
  exit(0);
}

struct iterinfo {
  CURLU *uh;
  const char *part;
  size_t plen;
  char *ptr;
  unsigned int varmask; /* sets 1 << [component] */
};

struct option {
  struct curl_slist *url_list;
  struct curl_slist *append_path;
  struct curl_slist *append_query;
  struct curl_slist *set_list;
  struct curl_slist *trim_list;
  struct curl_slist *iter_list;
  struct curl_slist *replace_list;
  const char *redirect;
  const char *qsep;
  const char *format;
  FILE *url;
  bool urlopen;
  bool jsonout;
  bool verify;
  bool accept_space;
  bool curl;
  bool default_port;
  bool keep_port;
  bool punycode;
  bool puny2idn;
  bool sort_query;
  bool no_guess_scheme;
  bool urlencode;
  bool end_of_options;
  bool quiet_warnings;
  bool force_replace;

  /* -- stats -- */
  unsigned int urls;
};

static void trurl_warnf(struct option *o, const char *fmt, ...)
{
  if(!o->quiet_warnings) {
    va_list ap;
    va_start(ap, fmt);
    fputs(WARN_PREFIX, stderr);
    vfprintf(stderr, fmt, ap);
    fputs("\n", stderr);
    va_end(ap);
  }
}

#define MAX_QPAIRS 1000
struct string qpairs[MAX_QPAIRS]; /* encoded */
struct string qpairsdec[MAX_QPAIRS]; /* decoded */
int nqpairs; /* how many is stored */

static void trurl_cleanup_options(struct option *o)
{
  if(!o)
    return;
  curl_slist_free_all(o->url_list);
  curl_slist_free_all(o->set_list);
  curl_slist_free_all(o->iter_list);
  curl_slist_free_all(o->append_query);
  curl_slist_free_all(o->trim_list);
  curl_slist_free_all(o->replace_list);
  curl_slist_free_all(o->append_path);
}

static void errorf_low(const char *fmt, va_list ap)
{
  message_low(ERROR_PREFIX, "\n"
              ERROR_PREFIX "Try " PROGNAME " -h for help\n", fmt, ap);
}

static void errorf(struct option *o, int exit_code, const char *fmt, ...)
{
  va_list ap;
  va_start(ap, fmt);
  errorf_low(fmt, ap);
  va_end(ap);
  trurl_cleanup_options(o);
  curl_global_cleanup();
  exit(exit_code);
}

static char *xstrdup(struct option *o, const char *ptr)
{
  char *temp = strdup(ptr);
  if(!temp)
    errorf(o, ERROR_MEM, "out of memory");
  return temp;
}

static void verify(struct option *o, int exit_code, const char *fmt, ...)
{
  va_list ap;
  va_start(ap, fmt);
  if(!o->verify) {
    warnf_low(fmt, ap);
    va_end(ap);
  }
  else {
    /* make sure to terminate the JSON array */
    if(o->jsonout)
      printf("%s]\n", o->urls ? "\n" : "");
    errorf_low(fmt, ap);
    va_end(ap);
    trurl_cleanup_options(o);
    curl_global_cleanup();
    exit(exit_code);
  }
}

static char *strurldecode(const char *url, int inlength, int *outlength)
{
  return curl_easy_unescape(NULL, inlength ?
url : "", inlength, outlength); } static void urladd(struct option *o, const char *url) { struct curl_slist *n; n = curl_slist_append(o->url_list, url); if(n) o->url_list = n; } /* read URLs from this file/stdin */ static void urlfile(struct option *o, const char *file) { FILE *f; if(o->url) errorf(o, ERROR_FLAG, "only one --url-file is supported"); if(strcmp("-", file)) { f = fopen(file, "rt"); if(!f) errorf(o, ERROR_FILE, "--url-file %s not found", file); o->urlopen = true; } else f = stdin; o->url = f; } static void pathadd(struct option *o, const char *path) { struct curl_slist *n; char *urle = curl_easy_escape(NULL, path, 0); if(urle) { n = curl_slist_append(o->append_path, urle); if(n) { o->append_path = n; } curl_free(urle); } } static char *encodeassign(const char *query) { char *p = strchr(query, '='); char *urle; if(p) { /* URL encode the left and the right side of the '=' separately */ char *f1 = curl_easy_escape(NULL, query, (int)(p - query)); char *f2 = curl_easy_escape(NULL, p + 1, 0); urle = curl_maprintf("%s=%s", f1, f2); curl_free(f1); curl_free(f2); } else urle = curl_easy_escape(NULL, query, 0); return urle; } static void queryadd(struct option *o, const char *query) { char *urle = encodeassign(query); if(urle) { struct curl_slist *n = curl_slist_append(o->append_query, urle); if(n) o->append_query = n; curl_free(urle); } } static void appendadd(struct option *o, const char *arg) { if(!strncmp("path=", arg, 5)) pathadd(o, arg + 5); else if(!strncmp("query=", arg, 6)) queryadd(o, arg + 6); else errorf(o, ERROR_APPEND, "--append unsupported component: %s", arg); } static void setadd(struct option *o, const char *set) /* [component]=[data] */ { struct curl_slist *n; n = curl_slist_append(o->set_list, set); if(n) o->set_list = n; } static void iteradd(struct option *o, const char *iter) /* [component]=[data] */ { struct curl_slist *n; n = curl_slist_append(o->iter_list, iter); if(n) o->iter_list = n; } static void trimadd(struct option *o, const char 
*trim) /* [component]=[data] */ { struct curl_slist *n; n = curl_slist_append(o->trim_list, trim); if(n) o->trim_list = n; } static void replaceadd(struct option *o, const char *replace_list) /* [component]=[data] */ { if(replace_list) { char *urle = encodeassign(replace_list); if(urle) { struct curl_slist *n = curl_slist_append(o->replace_list, urle); if(n) o->replace_list = n; curl_free(urle); } } else errorf(o, ERROR_REPL, "No data passed to replace component"); } static bool longarg(const char *flag, const char *check) { /* the given flag might end with an equals sign */ size_t len = strlen(flag); return (!strcmp(flag, check) || (!strncmp(flag, check, len) && check[len] == '=')); } static bool checkoptarg(struct option *o, const char *flag, const char *given, const char *arg) { bool shortopt = false; if((flag[0] == '-') && (flag[1] != '-')) shortopt = true; if((!shortopt && longarg(flag, given)) || (!strncmp(flag, given, 2) && shortopt)) { if(!arg) errorf(o, ERROR_ARG, "Missing argument for %s", flag); return true; } return false; } static int getarg(struct option *o, const char *flag, const char *arg, bool *usedarg) { bool gap = true; *usedarg = false; if((flag[0] == '-') && (flag[1] != '-') && flag[2]) { arg = (char *)&flag[2]; gap = false; } else if((flag[0] == '-') && (flag[1] == '-')) { char *equals = strchr(&flag[2], '='); if(equals) { arg = (char *)&equals[1]; gap = false; } } if(!strcmp("--", flag)) o->end_of_options = true; else if(!strcmp("-v", flag) || !strcmp("--version", flag)) show_version(); else if(!strcmp("-h", flag) || !strcmp("--help", flag)) help(); else if(checkoptarg(o, "--url", flag, arg)) { urladd(o, arg); *usedarg = gap; } else if(checkoptarg(o, "-f", flag, arg) || checkoptarg(o, "--url-file", flag, arg)) { urlfile(o, arg); *usedarg = gap; } else if(checkoptarg(o, "-a", flag, arg) || checkoptarg(o, "--append", flag, arg)) { appendadd(o, arg); *usedarg = gap; } else if(checkoptarg(o, "-s", flag, arg) || checkoptarg(o, "--set", flag, 
arg)) { setadd(o, arg); *usedarg = gap; } else if(checkoptarg(o, "--iterate", flag, arg)) { iteradd(o, arg); *usedarg = gap; } else if(checkoptarg(o, "--redirect", flag, arg)) { if(o->redirect) errorf(o, ERROR_FLAG, "only one --redirect is supported"); o->redirect = arg; *usedarg = gap; } else if(checkoptarg(o, "--query-separator", flag, arg)) { if(o->qsep) errorf(o, ERROR_FLAG, "only one --query-separator is supported"); if(strlen(arg) != 1) errorf(o, ERROR_FLAG, "only single-letter query separators are supported"); o->qsep = arg; *usedarg = gap; } else if(checkoptarg(o, "--trim", flag, arg)) { if(strncmp(arg, "query=", 6)) errorf(o, ERROR_TRIM, "Unsupported trim component: %s", arg); trimadd(o, &arg[6]); *usedarg = gap; } else if(checkoptarg(o, "--qtrim", flag, arg)) { trimadd(o, arg); *usedarg = gap; } else if(checkoptarg(o, "-g", flag, arg) || checkoptarg(o, "--get", flag, arg)) { if(o->format) errorf(o, ERROR_FLAG, "only one --get is supported"); if(o->jsonout) errorf(o, ERROR_FLAG, "--get is mutually exclusive with --json"); o->format = arg; *usedarg = gap; } else if(!strcmp("--json", flag)) { if(o->format) errorf(o, ERROR_FLAG, "--json is mutually exclusive with --get"); o->jsonout = true; } else if(!strcmp("--verify", flag)) o->verify = true; else if(!strcmp("--accept-space", flag)) { #ifdef SUPPORTS_ALLOW_SPACE o->accept_space = true; #else trurl_warnf(o, "built with too old libcurl version, --accept-space does not work"); #endif } else if(!strcmp("--curl", flag)) o->curl = true; else if(!strcmp("--default-port", flag)) o->default_port = true; else if(!strcmp("--keep-port", flag)) o->keep_port = true; else if(!strcmp("--punycode", flag)) { if(o->puny2idn) errorf(o, ERROR_FLAG, "--punycode is mutually exclusive with --as-idn"); o->punycode = true; } else if(!strcmp("--as-idn", flag)) { if(o->punycode) errorf(o, ERROR_FLAG, "--as-idn is mutually exclusive with --punycode"); o->puny2idn = true; } else if(!strcmp("--no-guess-scheme", flag)) o->no_guess_scheme 
= true; else if(!strcmp("--sort-query", flag)) o->sort_query = true; else if(!strcmp("--urlencode", flag)) o->urlencode = true; else if(!strcmp("--quiet", flag)) o->quiet_warnings = true; else if(!strcmp("--replace", flag)) { replaceadd(o, arg); *usedarg = gap; } else if(!strcmp("--replace-append", flag) || !strcmp("--force-replace", flag)) { /* the initial name */ replaceadd(o, arg); o->force_replace = true; *usedarg = gap; } else return 1; /* unrecognized option */ return 0; } static void showqkey(FILE *stream, const char *key, size_t klen, bool urldecode, bool showall) { int i; bool shown = false; struct string *qp = urldecode ? qpairsdec : qpairs; for(i = 0; i< nqpairs; i++) { if(!strncmp(key, qp[i].str, klen) && (qp[i].str[klen] == '=')) { if(shown) fputc(' ', stream); fprintf(stream, "%.*s", (int) (qp[i].len - klen - 1), &qp[i].str[klen + 1]); if(!showall) break; shown = true; } } } /* component to variable pointer */ static const struct var *comp2var(const char *name, size_t vlen) { int i; for(i = 0; variables[i].name; i++) if((strlen(variables[i].name) == vlen) && !strncmp(name, variables[i].name, vlen)) return &variables[i]; return NULL; } static CURLUcode geturlpart(struct option *o, int modifiers, CURLU *uh, CURLUPart part, char **out) { CURLUcode rc = curl_url_get(uh, part, out, (((modifiers & VARMODIFIER_DEFAULT) || o->default_port) ? CURLU_DEFAULT_PORT : ((part != CURLUPART_URL || o->keep_port) ? 0 : CURLU_NO_DEFAULT_PORT))| #ifdef SUPPORTS_PUNYCODE (((modifiers & VARMODIFIER_PUNY) || o->punycode) ? CURLU_PUNYCODE : 0)| #endif #ifdef SUPPORTS_PUNY2IDN (((modifiers & VARMODIFIER_PUNY2IDN) || o->puny2idn) ? CURLU_PUNY2IDN : 0) | #endif #ifdef SUPPORTS_GET_EMPTY ((modifiers & VARMODIFIER_EMPTY) ? CURLU_GET_EMPTY : 0) | #endif (o->curl ? 0 : CURLU_NON_SUPPORT_SCHEME)| (((modifiers & VARMODIFIER_URLENCODED) || o->urlencode) ? 
                              0 : CURLU_URLDECODE));

#ifdef SUPPORTS_PUNY2IDN
  /* retry get w/ out puny2idn to handle invalid punycode conversions */
  if(rc == CURLUE_BAD_HOSTNAME &&
     (o->puny2idn || (modifiers & VARMODIFIER_PUNY2IDN))) {
    curl_free(*out);
    modifiers &= ~VARMODIFIER_PUNY2IDN;
    o->puny2idn = false;
    trurl_warnf(o,
                "Error converting url to IDN [%s]",
                curl_url_strerror(rc));
    return geturlpart(o, modifiers, uh, part, out);
  }
#endif
  return rc;
}

static bool is_valid_trurl_error(CURLUcode rc)
{
  switch(rc) {
  case CURLUE_OK:
  case CURLUE_NO_SCHEME:
  case CURLUE_NO_USER:
  case CURLUE_NO_PASSWORD:
  case CURLUE_NO_OPTIONS:
  case CURLUE_NO_HOST:
  case CURLUE_NO_PORT:
  case CURLUE_NO_QUERY:
  case CURLUE_NO_FRAGMENT:
#ifdef SUPPORTS_ZONEID
  case CURLUE_NO_ZONEID:
#endif
    /* silently ignore */
    return false;
  default:
    return true;
  }
  return true;
}

static void showurl(FILE *stream, struct option *o, int modifiers,
                    CURLU *uh)
{
  char *url;
  CURLUcode rc = geturlpart(o, modifiers, uh, CURLUPART_URL, &url);
  if(rc) {
    /* Do not free the options here: verify() releases them itself before it
       exits, and when it only warns the lists are still needed for the
       remaining URLs. */
    verify(o, ERROR_BADURL, "invalid url [%s]", curl_url_strerror(rc));
    return;
  }
  fputs(url, stream);
  curl_free(url);
}

static void get(struct option *o, CURLU *uh)
{
  FILE *stream = stdout;
  const char *ptr = o->format;
  bool done = false;
  char startbyte = 0;
  char endbyte = 0;

  while(ptr && *ptr && !done) {
    if(!startbyte && (('{' == *ptr) || ('[' == *ptr))) {
      startbyte = *ptr;
      if('{' == *ptr)
        endbyte = '}';
      else
        endbyte = ']';
    }
    if(startbyte == *ptr) {
      if(startbyte == ptr[1]) {
        /* an escaped {-letter */
        fputc(startbyte, stream);
        ptr += 2;
      }
      else {
        /* this is meant as a variable to output */
        const char *start = ptr;
        char *end;
        char *cl;
        size_t vlen;
        bool isquery = false;
        bool queryall = false;
        bool strict = false; /* strict mode, fail on URL decode problems */
        bool must = false; /* must mode, fail on missing component */
        int mods = 0;
        end = strchr(ptr, endbyte);
        ptr++; /* pass the { */
        if(!end) {
          /* syntax error */
          fputc(startbyte, stream);
          continue;
        }

        /* {path} {:path} {/path} */
        if(*ptr
== ':') { mods |= VARMODIFIER_URLENCODED; ptr++; } vlen = end - ptr; do { size_t wordlen; cl = memchr(ptr, ':', vlen); if(!cl) break; wordlen = cl - ptr + 1; /* modifiers! */ if(!strncmp(ptr, "default:", wordlen)) mods |= VARMODIFIER_DEFAULT; else if(!strncmp(ptr, "puny:", wordlen)) { if(mods & VARMODIFIER_PUNY2IDN) errorf(o, ERROR_GET, "puny modifier is mutually exclusive with idn"); mods |= VARMODIFIER_PUNY; } else if(!strncmp(ptr, "idn:", wordlen)) { if(mods & VARMODIFIER_PUNY) errorf(o, ERROR_GET, "idn modifier is mutually exclusive with puny"); mods |= VARMODIFIER_PUNY2IDN; } else if(!strncmp(ptr, "strict:", wordlen)) strict = true; else if(!strncmp(ptr, "must:", wordlen)) { must = true; mods |= VARMODIFIER_EMPTY; } else if(!strncmp(ptr, "url:", wordlen)) mods |= VARMODIFIER_URLENCODED; else { if(!strncmp(ptr, "query-all:", wordlen)) { isquery = true; queryall = true; } else if(!strncmp(ptr, "query:", wordlen)) isquery = true; else { /* syntax error */ vlen = 0; end[1] = '\0'; } break; } ptr = cl + 1; vlen = end - ptr; } while(true); if(isquery) { showqkey(stream, cl + 1, end - cl - 1, !o->urlencode && !(mods & VARMODIFIER_URLENCODED), queryall); } else if(!vlen) errorf(o, ERROR_GET, "Bad --get syntax: %s", start); else if(!strncmp(ptr, "url", vlen)) showurl(stream, o, mods, uh); else { const struct var *v = comp2var(ptr, vlen); if(v) { char *nurl; /* ask for it URL encode always, to avoid libcurl warning on content */ CURLUcode rc = geturlpart(o, mods | VARMODIFIER_URLENCODED, uh, v->part, &nurl); if(!rc && !(mods & VARMODIFIER_URLENCODED) && !o->urlencode) { /* it should not be encoded in the output */ int olen; char *dec = curl_easy_unescape(NULL, nurl, 0, &olen); curl_free(nurl); if(memchr(dec, '\0', (size_t)olen)) { /* a binary zero cannot be shown */ rc = CURLUE_URLDECODE; curl_free(dec); dec = NULL; } nurl = dec; } if(rc == CURLUE_OK) { fputs(nurl, stream); curl_free(nurl); } else if(!is_valid_trurl_error(rc) && must) errorf(o, ERROR_GET, "missing 
must:%s", v->name); else if(is_valid_trurl_error(rc) || strict) { if((rc == CURLUE_URLDECODE) && strict) errorf(o, ERROR_GET, "problems URL decoding %s", v->name); else trurl_warnf(o, "%s (%s)", curl_url_strerror(rc), v->name); } } else errorf(o, ERROR_GET, "\"%.*s\" is not a recognized URL component", (int)vlen, ptr); } ptr = end + 1; /* pass the end */ } } else if('\\' == *ptr && ptr[1]) { switch(ptr[1]) { case 'r': fputc('\r', stream); break; case 'n': fputc('\n', stream); break; case 't': fputc('\t', stream); break; case '\\': fputc('\\', stream); break; case '{': fputc('{', stream); break; case '[': fputc('[', stream); break; default: /* unknown, just output this */ fputc(*ptr, stream); fputc(ptr[1], stream); break; } ptr += 2; } else { fputc(*ptr, stream); ptr++; } } fputc('\n', stream); } static const struct var *setone(CURLU *uh, const char *setline, struct option *o) { char *ptr = strchr(setline, '='); const struct var *v = NULL; if(ptr && (ptr > setline)) { size_t vlen = ptr - setline; bool urlencode = true; bool conditional = false; bool found = false; if(vlen) { int back = -1; size_t reqlen = 1; while(vlen > reqlen) { if(ptr[back] == ':') { urlencode = false; vlen--; } else if(ptr[back] == '?') { conditional = true; vlen--; } else break; reqlen++; back--; } } v = comp2var(setline, vlen); if(v) { CURLUcode rc = CURLUE_OK; bool skip = false; if((v->part == CURLUPART_HOST) && ('[' == ptr[1])) /* when setting an IPv6 numerical address, disable URL encoding */ urlencode = false; if(conditional) { char *piece; rc = curl_url_get(uh, v->part, &piece, CURLU_NO_GUESS_SCHEME); if(!rc) { skip = true; curl_free(piece); } } if(!skip) rc = curl_url_set(uh, v->part, ptr[1] ? &ptr[1] : NULL, (o->curl ? 0 : CURLU_NON_SUPPORT_SCHEME)| (urlencode ? 
                        CURLU_URLENCODE : 0));
      if(rc)
        warnf("Error setting %s: %s", v->name, curl_url_strerror(rc));
      found = true;
    }
    if(!found)
      errorf(o, ERROR_SET,
             "unknown component: %.*s", (int)vlen, setline);
  }
  else
    errorf(o, ERROR_SET, "invalid --set syntax: %s", setline);
  return v;
}

static unsigned int set(CURLU *uh, struct option *o)
{
  struct curl_slist *node;
  unsigned int mask = 0;
  for(node = o->set_list; node; node = node->next) {
    const struct var *v;
    char *setline = node->data;
    v = setone(uh, setline, o);
    if(v) {
      if(mask & (1 << v->part))
        errorf(o, ERROR_SET,
               "duplicate --set for component %s", v->name);
      mask |= (1 << v->part);
    }
  }
  return mask; /* the set components */
}

static void jsonString(FILE *stream, const char *in, size_t len,
                       bool lowercase)
{
  const unsigned char *i = (unsigned char *)in;
  const char *in_end = &in[len];
  fputc('\"', stream);
  for(; i < (unsigned char *)in_end; i++) {
    switch(*i) {
    case '\\':
      fputs("\\\\", stream);
      break;
    case '\"':
      fputs("\\\"", stream);
      break;
    case '\b':
      fputs("\\b", stream);
      break;
    case '\f':
      fputs("\\f", stream);
      break;
    case '\n':
      fputs("\\n", stream);
      break;
    case '\r':
      fputs("\\r", stream);
      break;
    case '\t':
      fputs("\\t", stream);
      break;
    default:
      if(*i < 32)
        fprintf(stream, "\\u%04x", *i);
      else {
        char out = *i;
        if(lowercase && (out >= 'A' && out <= 'Z'))
          /* do not use tolower() since that's locale specific */
          out |= ('a' - 'A');
        fputc(out, stream);
      }
      break;
    }
  }
  fputc('\"', stream);
}

static void json(struct option *o, CURLU *uh)
{
  int i;
  bool first = true;
  char *url;
  CURLUcode rc = geturlpart(o, 0, uh, CURLUPART_URL, &url);
  if(rc) {
    /* Do not free the options here: verify() releases them itself before it
       exits, and when it only warns the lists are still needed for the
       remaining URLs. */
    verify(o, ERROR_BADURL, "invalid url [%s]", curl_url_strerror(rc));
    return;
  }
  printf("%s\n {\n \"url\": ", o->urls ? "," : "");
  jsonString(stdout, url, strlen(url), false);
  curl_free(url);
  fputs(",\n \"parts\": {\n", stdout);
  /* special error handling required to not print params array.
*/ bool params_errors = false; for(i = 0; variables[i].name; i++) { char *part; /* ask for the URL encoded version so that weird control characters do not cause problems. URL decode it when push to json. */ rc = geturlpart(o, VARMODIFIER_URLENCODED, uh, variables[i].part, &part); if(!rc) { int olen; char *dec = NULL; if(!o->urlencode) { if(variables[i].part == CURLUPART_QUERY) { /* query parts have '+' for space */ char *n; char *p = part; do { n = strchr(p, '+'); if(n) { *n = ' '; p = n + 1; } } while(n); } dec = curl_easy_unescape(NULL, part, 0, &olen); if(!dec) errorf(o, ERROR_MEM, "out of memory"); } if(!first) fputs(",\n", stdout); first = false; printf(" \"%s\": ", variables[i].name); if(dec) jsonString(stdout, dec, (size_t)olen, false); else jsonString(stdout, part, strlen(part), false); curl_free(part); curl_free(dec); } else if(is_valid_trurl_error(rc)) { trurl_warnf(o, "%s (%s)", curl_url_strerror(rc), variables[i].name); params_errors = true; } } fputs("\n }", stdout); first = true; if(nqpairs && !params_errors) { int j; fputs(",\n \"params\": [\n", stdout); for(j = 0 ; j < nqpairs; j++) { const char *sep = memchr(qpairsdec[j].str, '=', qpairsdec[j].len); const char *value = sep ? sep + 1 : ""; int value_len = (int) qpairsdec[j].len - (int)(value - qpairsdec[j].str); /* don't print out empty/trimmed values */ if(!qpairsdec[j].len || !qpairsdec[j].str[0]) continue; if(!first) fputs(",\n", stdout); first = false; fputs(" {\n \"key\": ", stdout); jsonString(stdout, qpairsdec[j].str, sep ? 
(size_t)(sep - qpairsdec[j].str) : qpairsdec[j].len, false); fputs(",\n \"value\": ", stdout); jsonString(stdout, sep?value:"", sep?value_len:0, false); fputs("\n }", stdout); } fputs("\n ]", stdout); } fputs("\n }", stdout); } /* --trim query="utm_*" */ static bool trim(struct option *o) { bool query_is_modified = false; struct curl_slist *node; for(node = o->trim_list; node; node = node->next) { char *ptr = node->data; if(ptr) { /* 'ptr' should be a fixed string or a pattern ending with an asterisk */ size_t inslen; bool pattern = false; int i; char *temp = NULL; inslen = strlen(ptr); if(inslen) { pattern = ptr[inslen - 1] == '*'; if(pattern && (inslen > 1)) { pattern ^= ptr[inslen - 2] == '\\'; if(!pattern) { /* the two final letters are \*, but the backslash needs to be removed. Get a copy and edit that accordingly. */ temp = xstrdup(o, ptr); temp[inslen - 2] = '*'; temp[inslen - 1] = '\0'; ptr = temp; inslen--; /* one byte shorter now */ } } if(pattern) inslen--; } for(i = 0 ; i < nqpairs; i++) { char *q = qpairs[i].str; char *sep = strchr(q, '='); size_t qlen; if(sep) qlen = sep - q; else qlen = strlen(q); if((pattern && (inslen <= qlen) && !casecompare(q, ptr, inslen)) || (!pattern && (inslen == qlen) && !casecompare(q, ptr, inslen))) { /* this qpair should be stripped out */ free(qpairs[i].str); free(qpairsdec[i].str); qpairs[i].str = xstrdup(o, ""); /* marked as deleted */ qpairs[i].len = 0; qpairsdec[i].str = xstrdup(o, ""); /* marked as deleted */ qpairsdec[i].len = 0; query_is_modified = true; } } free(temp); } } return query_is_modified; } static char *decodequery(char *str, size_t len, int *olen) { /* handle '+' to ' ' outside of the libcurl call */ char *p = str; size_t plen = len; do { char *n = memchr(p, '+', plen); if(n) { *n = ' '; ++n; plen -= (n - p); } p = n; } while(p); return curl_easy_unescape(NULL, str, (int)len, olen); } /* the unusual thing here is that we let '*' remain as-is */ #define ISURLPUNTCS(x) (((x) == '-') || ((x) == '.') || 
((x) == '_') || \ ((x) == '~') || ((x) == '*')) #define ISUPPER(x) (((x) >= 'A') && ((x) <= 'Z')) #define ISLOWER(x) (((x) >= 'a') && ((x) <= 'z')) #define ISDIGIT(x) (((x) >= '0') && ((x) <= '9')) #define ISALNUM(x) (ISDIGIT(x) || ISLOWER(x) || ISUPPER(x)) #define ISUNRESERVED(x) (ISALNUM(x) || ISURLPUNTCS(x)) static char *encodequery(char *str, size_t len) { /* to handle ' ' to '+' escaping we cannot use libcurl's URL encode function */ char *dupe = malloc(len * 3 + 1); /* worst case */ char *p = dupe; if(!p) return NULL; while(len--) { /* treat the characters unsigned */ unsigned char in = (unsigned char)*str++; if(in == ' ') *dupe++ = '+'; else if(ISUNRESERVED(in)) *dupe++ = in; else { /* encode it */ const char hex[] = "0123456789abcdef"; dupe[0]='%'; dupe[1] = hex[in>>4]; dupe[2] = hex[in & 0xf]; dupe += 3; } } *dupe = 0; return p; } /* URL decode, then URL encode it back to normalize. But don't touch the first '=' if there is one */ static struct string *memdupzero(char *source, size_t len, bool *modified) { struct string *ret = calloc(1, sizeof(struct string)); char *left = NULL; char *right = NULL; char *el = NULL; char *er = NULL; char *encode = NULL; if(!ret) return NULL; if(len) { char *sep = memchr(source, '=', len); int olen; if(!sep) { /* no '=' */ char *decode = decodequery(source, (int)len, &olen); if(decode) encode = encodequery(decode, olen); else goto error; curl_free(decode); } else { int llen; int rlen; int leftside; int rightside; /* decode both sides */ leftside = (int)(sep - source); if(leftside) { left = decodequery(source, leftside, &llen); if(!left) goto error; } else { left = NULL; llen = 0; } /* length on the right side of '=': */ rightside = (int)len - (int)(sep - source) - 1; if(rightside) { right = decodequery(sep + 1, (int)len - (int)(sep - source) - 1, &rlen); if(!right) goto error; } else { right = NULL; rlen = 0; } /* encode both sides again */ if(left) { el = encodequery(left, llen); if(!el) goto error; } if(right) { er = 
encodequery(right, rlen); if(!er) goto error; } encode = curl_maprintf("%s=%s", el ? el : "", er ? er : ""); if(!encode) goto error; } olen = (int)strlen(encode); if(((size_t)olen != len) || strcmp(encode, source)) *modified |= true; ret->str = encode; ret->len = olen; } curl_free(left); curl_free(right); free(el); free(er); return ret; error: curl_free(left); curl_free(right); free(el); free(er); free(encode); free(ret); return NULL; } /* URL decode the pair and return it in an allocated chunk */ static struct string *memdupdec(char *source, size_t len, bool json) { char *sep = memchr(source, '=', len); char *left = NULL; char *right = NULL; int right_len = 0; int left_len = 0; char *str; struct string *ret; left = strurldecode(source, (int)(sep ? (size_t)(sep - source) : len), &left_len); if(sep) { char *p; int plen; right = strurldecode(sep + 1, (int)(len - (sep - source) - 1), &right_len); /* convert null bytes to periods */ for(plen = right_len, p = right; plen; plen--, p++) { if(!*p && !json) { *p = REPLACE_NULL_BYTE; } } } str = malloc(sizeof(char) * (left_len + (sep?(right_len + 1):0))); if(!str) { curl_free(right); curl_free(left); return NULL; } memcpy(str, left, left_len); if(sep) { str[left_len] = '='; memcpy(str + 1 + left_len, right, right_len); } curl_free(right); curl_free(left); ret = malloc(sizeof(struct string)); if(!ret) { free(str); return NULL; } ret->str = str; ret->len = left_len + (sep?(right_len + 1):0); return ret; } static void freeqpairs(void) { int i; for(i = 0; istr; qpairs[nqpairs].len = p->len; qpairsdec[nqpairs].str = pdec->str; qpairsdec[nqpairs].len = pdec->len; nqpairs++; } } else warnf("too many query pairs"); if(pdec) free(pdec); if(p) free(p); return modified; } /* convert the query string into an array of name=data pair */ static bool extractqpairs(CURLU *uh, struct option *o) { char *q = NULL; bool modified = false; memset(qpairs, 0, sizeof(qpairs)); nqpairs = 0; /* extract the query */ if(!curl_url_get(uh, CURLUPART_QUERY, 
&q, 0)) { char *p = q; char *amp; while(*p) { size_t len; amp = strchr(p, o->qsep[0]); if(!amp) len = strlen(p); else len = amp - p; modified |= addqpair(p, len, o->jsonout); if(amp) p = amp + 1; else break; } } curl_free(q); return modified; } static void qpair2query(CURLU *uh, struct option *o) { int i; char *nq = NULL; for(i = 0; iqsep : "", qpairs[i].len ? qpairs[i].str : ""); curl_free(oldnq); } if(nq) { int rc = curl_url_set(uh, CURLUPART_QUERY, nq, 0); if(rc) trurl_warnf(o, "internal problem: failed to store updated query in URL"); } curl_free(nq); } /* sort case insensitively */ static int cmpfunc(const void *p1, const void *p2) { int i; int len = (int)((((struct string *)p1)->len) < (((struct string *)p2)->len)? (((struct string *)p1)->len) : (((struct string *)p2)->len)); for(i = 0; i < len; i++) { char c1 = ((struct string *)p1)->str[i] | ('a' - 'A'); char c2 = ((struct string *)p2)->str[i] | ('a' - 'A'); if(c1 != c2) return c1 - c2; } return 0; } static bool sortquery(struct option *o) { if(o->sort_query) { /* not these two lists may no longer be the same order after the sort */ qsort(&qpairs[0], nqpairs, sizeof(struct string), cmpfunc); qsort(&qpairsdec[0], nqpairs, sizeof(struct string), cmpfunc); return true; } return false; } static bool replace(struct option *o) { bool query_is_modified = false; struct curl_slist *node; for(node = o->replace_list; node; node = node->next) { struct string key; struct string value; bool replaced = false; int i; key.str = node->data; value.str = strchr(key.str, '='); if(value.str) { key.len = value.str++ - key.str; value.len = strlen(value.str); } else { key.len = strlen(key.str); value.str = NULL; value.len = 0; } for(i = 0; i < nqpairs; i++) { char *q = qpairs[i].str; /* not the correct query, move on */ if(strncmp(q, key.str, key.len)) continue; free(qpairs[i].str); free(qpairsdec[i].str); /* this is a duplicate remove it. 
*/ if(replaced) { qpairs[i].len = 0; qpairs[i].str = xstrdup(o, ""); qpairsdec[i].len = 0; qpairsdec[i].str = xstrdup(o, ""); continue; } struct string *pdec = memdupdec(key.str, key.len + value.len + 1, o->jsonout); struct string *p = memdupzero(key.str, key.len + value.len + (value.str ? 1 : 0), &query_is_modified); qpairs[i].len = p->len; qpairs[i].str = p->str; qpairsdec[i].len = pdec->len; qpairsdec[i].str = pdec->str; free(pdec); free(p); query_is_modified = replaced = true; } if(!replaced && o->force_replace) { addqpair(key.str, strlen(key.str), o->jsonout); query_is_modified = true; } } return query_is_modified; } static CURLUcode seturl(struct option *o, CURLU *uh, const char *url) { return curl_url_set(uh, CURLUPART_URL, url, (o->no_guess_scheme ? 0 : CURLU_GUESS_SCHEME)| (o->curl ? 0 : CURLU_NON_SUPPORT_SCHEME)| (o->accept_space ? CURLU_ALLOW_SPACE : 0)| CURLU_URLENCODE); } static char *canonical_path(const char *path) { /* split the path per slash, URL decode + encode, then put together again */ size_t len = strlen(path); char *sl; char *dupe = NULL; do { char *opath; char *npath; char *ndupe; int olen; sl = memchr(path, '/', len); size_t partlen = sl ? (size_t)(sl - path) : len; if(partlen) { /* First URL decode the part */ opath = curl_easy_unescape(NULL, path, (int)partlen, &olen); if(!opath) return NULL; /* Then URL encode it again */ npath = curl_easy_escape(NULL, opath, olen); curl_free(opath); if(!npath) return NULL; ndupe = curl_maprintf("%s%s%s", dupe ? dupe : "", npath, sl ? "/": ""); curl_free(npath); } else if(sl) { /* zero length part but a slash */ ndupe = curl_maprintf("%s/", dupe ? 
dupe : ""); } else { /* no part, no slash */ break; } curl_free(dupe); if(!ndupe) return NULL; dupe = ndupe; if(sl) { path = sl + 1; len -= partlen + 1; } } while(sl); return dupe; } static void normalize_part(struct option *o, CURLU *uh, CURLUPart part) { char *ptr; size_t ptrlen = 0; (void)curl_url_get(uh, part, &ptr, 0); if(ptr) ptrlen = strlen(ptr); if(ptrlen) { int olen; char *uptr; /* First URL decode the component */ char *rawptr = curl_easy_unescape(NULL, ptr, (int)ptrlen, &olen); if(!rawptr) errorf(o, ERROR_MEM, "out of memory"); /* Then URL encode it again */ uptr = curl_easy_escape(NULL, rawptr, olen); curl_free(rawptr); if(!uptr) errorf(o, ERROR_MEM, "out of memory"); if(strcmp(ptr, uptr)) /* changed, store the updated one */ (void)curl_url_set(uh, part, uptr, 0); curl_free(uptr); } curl_free(ptr); } static void singleurl(struct option *o, const char *url, /* might be NULL */ struct iterinfo *iinfo, struct curl_slist *iter) { CURLU *uh = iinfo->uh; bool first_lap = true; if(!uh) { uh = curl_url(); if(!uh) errorf(o, ERROR_MEM, "out of memory"); if(url) { CURLUcode rc = seturl(o, uh, url); if(rc) { curl_url_cleanup(uh); verify(o, ERROR_BADURL, "%s [%s]", curl_url_strerror(rc), url); return; } if(o->redirect) { rc = seturl(o, uh, o->redirect); if(rc) { curl_url_cleanup(uh); verify(o, ERROR_BADURL, "invalid redirection: %s [%s]", curl_url_strerror(rc), o->redirect); return; } } } } do { struct curl_slist *p; bool url_is_invalid = false; bool query_is_modified = false; unsigned setmask = 0; /* set everything */ setmask = set(uh, o); if(iter) { char iterbuf[1024]; /* "part=item1 item2 item2" */ const char *part; size_t plen; const char *w; size_t wlen; char *sep; bool urlencode = true; const struct var *v; if(!iinfo->ptr) { part = iter->data; sep = strchr(part, '='); if(!sep) errorf(o, ERROR_ITER, "wrong iterate syntax"); plen = sep - part; if(sep[-1] == ':') { urlencode = false; plen--; } w = sep + 1; /* store for next lap */ iinfo->part = part; iinfo->plen 
= plen; v = comp2var(part, plen); if(!v) { curl_url_cleanup(uh); errorf(o, ERROR_ITER, "bad component for iterate"); } if(iinfo->varmask & (1<<v->part)) { curl_url_cleanup(uh); errorf(o, ERROR_ITER, "duplicate component for iterate: %s", v->name); } if(setmask & (1 << v->part)) { curl_url_cleanup(uh); errorf(o, ERROR_ITER, "duplicate --iterate and --set for component %s", v->name); } } else { part = iinfo->part; plen = iinfo->plen; v = comp2var(part, plen); w = iinfo->ptr; } sep = strchr(w, ' '); if(sep) { wlen = sep - w; iinfo->ptr = sep + 1; /* next word is here */ } else { /* last word */ wlen = strlen(w); iinfo->ptr = NULL; } (void)curl_msnprintf(iterbuf, sizeof(iterbuf), "%.*s%s=%.*s", (int)plen, part, urlencode ? "" : ":", (int)wlen, w); setone(uh, iterbuf, o); if(iter->next) { struct iterinfo info; memset(&info, 0, sizeof(info)); info.uh = uh; info.varmask = iinfo->varmask | (1 << v->part); singleurl(o, url, &info, iter->next); } } if(first_lap) { /* extract the current path */ char *opath; char *cpath; bool path_is_modified = false; if(curl_url_get(uh, CURLUPART_PATH, &opath, 0)) errorf(o, ERROR_MEM, "out of memory"); /* append path segments */ for(p = o->append_path; p; p = p->next) { char *apath = p->data; char *npath; size_t olen; /* does the existing path end with a slash, then don't add one in between */ olen = strlen(opath); /* append the new segment */ npath = curl_maprintf("%s%s%s", opath, opath[olen-1] == '/' ? 
"" : "/", apath); curl_free(opath); opath = npath; path_is_modified = true; } cpath = canonical_path(opath); if(!cpath) errorf(o, ERROR_MEM, "out of memory"); if(strcmp(cpath, opath)) { /* updated */ path_is_modified = true; curl_free(opath); opath = cpath; } else curl_free(cpath); if(path_is_modified) { /* set the new path */ if(curl_url_set(uh, CURLUPART_PATH, opath, 0)) errorf(o, ERROR_MEM, "out of memory"); } curl_free(opath); normalize_part(o, uh, CURLUPART_FRAGMENT); normalize_part(o, uh, CURLUPART_USER); normalize_part(o, uh, CURLUPART_PASSWORD); normalize_part(o, uh, CURLUPART_OPTIONS); } query_is_modified |= extractqpairs(uh, o); /* trim parts */ query_is_modified |= trim(o); /* replace parts */ query_is_modified |= replace(o); if(first_lap) { /* append query segments */ for(p = o->append_query; p; p = p->next) { addqpair(p->data, strlen(p->data), o->jsonout); query_is_modified = true; } } /* sort query */ query_is_modified |= sortquery(o); /* put the query back */ if(query_is_modified) qpair2query(uh, o); /* make sure the URL is still valid */ if(!url || o->redirect || o->set_list || o->append_path) { char *ourl = NULL; CURLUcode rc = curl_url_get(uh, CURLUPART_URL, &ourl, 0); if(rc) { if(o->verify) /* only clean up if we're exiting */ curl_url_cleanup(uh); verify(o, ERROR_URL, "not enough input for a URL"); url_is_invalid = true; } else { rc = seturl(o, uh, ourl); if(rc) { if(o->verify) /* only clean up if we're exiting */ curl_url_cleanup(uh); verify(o, ERROR_BADURL, "%s [%s]", curl_url_strerror(rc), ourl); url_is_invalid = true; } else { char *nurl = NULL; rc = curl_url_get(uh, CURLUPART_URL, &nurl, 0); if(!rc) curl_free(nurl); else { if(o->verify) /* only clean up if we're exiting */ curl_url_cleanup(uh); verify(o, ERROR_BADURL, "url became invalid"); url_is_invalid = true; } } curl_free(ourl); } } if(iter && iter->next) ; else if(url_is_invalid) ; else if(o->jsonout) json(o, uh); else if(o->format) { /* custom output format */ get(o, uh); } else { /* 
default output is full URL */ char *nurl = NULL; int rc = geturlpart(o, 0, uh, CURLUPART_URL, &nurl); if(!rc) { printf("%s\n", nurl); curl_free(nurl); } } fflush(stdout); freeqpairs(); o->urls++; first_lap = false; } while(iinfo->ptr); if(!iinfo->uh) curl_url_cleanup(uh); } int main(int argc, const char **argv) { int exit_status = 0; struct option o; struct curl_slist *node; memset(&o, 0, sizeof(o)); setlocale(LC_ALL, ""); curl_global_init(CURL_GLOBAL_ALL); for(argc--, argv++; argc > 0; argc--, argv++) { bool usedarg = false; if(!o.end_of_options && argv[0][0] == '-') { /* dash-dash prefixed */ if(getarg(&o, argv[0], argv[1], &usedarg)) { /* if the long option ends with an equals sign, cut it there, if it is a short option, show just two letters */ size_t not_e = argv[0][1] == '-' ? strcspn(argv[0], "=") : 2; errorf(&o, ERROR_FLAG, "unknown option: %.*s", (int)not_e, argv[0]); } } else { /* this is a URL */ urladd(&o, argv[0]); } if(usedarg) { /* skip the parsed argument */ argc--; argv++; } } if(!o.qsep) o.qsep = "&"; if(o.jsonout) putchar('['); if(o.url) { /* this is a file to read URLs from */ char buffer[4096]; /* arbitrary max */ bool end_of_file = false; while(!end_of_file && fgets(buffer, sizeof(buffer), o.url)) { char *eol = strchr(buffer, '\n'); if(eol && (eol > buffer)) { if(eol[-1] == '\r') /* CRLF detected */ eol--; } else if(eol == buffer) { /* empty line */ continue; } else if(feof(o.url)) { /* end of file */ eol = strlen(buffer) + buffer; end_of_file = true; } else { /* line too long */ int ch; trurl_warnf(&o, "skipping long line"); do { ch = getc(o.url); } while(ch != EOF && ch != '\n'); if(ch == EOF) { if(ferror(o.url)) trurl_warnf(&o, "getc: %s", strerror(errno)); end_of_file = true; } continue; } /* trim trailing spaces and tabs */ while((eol > buffer) && ((eol[-1] == ' ') || eol[-1] == '\t')) eol--; if(eol > buffer) { /* if there is actual content left to deal with */ struct iterinfo iinfo; memset(&iinfo, 0, sizeof(iinfo)); *eol = 0; /* end of 
URL */ singleurl(&o, buffer, &iinfo, o.iter_list); } } if(!end_of_file && ferror(o.url)) trurl_warnf(&o, "fgets: %s", strerror(errno)); if(o.urlopen) fclose(o.url); } else { /* not reading URLs from a file */ node = o.url_list; do { if(node) { const char *url = node->data; struct iterinfo iinfo; memset(&iinfo, 0, sizeof(iinfo)); singleurl(&o, url, &iinfo, o.iter_list); node = node->next; } else { struct iterinfo iinfo; memset(&iinfo, 0, sizeof(iinfo)); o.verify = true; singleurl(&o, NULL, &iinfo, o.iter_list); } } while(node); } if(o.jsonout) printf("%s]\n", o.urls ? "\n" : ""); /* we're done with libcurl, so clean it up */ trurl_cleanup_options(&o); curl_global_cleanup(); return exit_status; } trurl-0.16.1/trurl.md0000664000000000000000000005764515010312005011375 0ustar00--- c: Copyright (C) Daniel Stenberg, , et al. SPDX-License-Identifier: curl Title: trurl Section: 1 Source: trurl 0.16.1 See-also: - curl (1) - wcurl (1) --- # NAME trurl - transpose URLs # SYNOPSIS **trurl [options / URLs]** # DESCRIPTION **trurl** parses, manipulates and outputs URLs and parts of URLs. It uses the RFC 3986 definition of URLs and it uses libcurl's URL parser to do so, which includes a few "extensions". The URL support is limited to "hierarchical" URLs, the ones that use `://` separators after the scheme. Typically you pass in one or more URLs and decide what of that you want output. Possibly modifying the URL as well. trurl knows URLs and every URL consists of up to ten separate and independent *components*. These components can be extracted, removed and updated with trurl and they are referred to by their respective names: scheme, user, password, options, host, port, path, query, fragment and zoneid. # NORMALIZATION When provided a URL to work with, trurl "normalizes" it. It means that individual URL components are URL decoded then URL encoded back again and set in the URL. 
Example:

    $ trurl 'http://ex%61mple:80/%62ath/a/../b?%2e%FF#tes%74'
    http://example/bath/b?.%ff#test

# OPTIONS

Options start with one or two dashes. Many of the options require an
additional value next to them.

Any other argument is interpreted as a URL argument, and is treated as if it
was following a `--url` option.

The first argument that is exactly two dashes (`--`), marks the end of
options; any argument after the end of options is interpreted as a URL
argument even if it starts with a dash.

Long options can be provided either as `--flag argument` or as
`--flag=argument`.

## -a, --append [component]=[data]

Append data to a component. This can only append data to the path and the
query components.

For path, this URL encodes and appends the new segment to the path, separated
with a slash.

For query, this URL encodes and appends the new segment to the query,
separated with an ampersand (&). If the appended segment contains an equal
sign (`=`) that one is kept verbatim and both sides of the first occurrence
are URL encoded separately.

## --accept-space

When set, trurl tries to accept spaces as part of the URL and instead URL
encode such occurrences accordingly.

According to RFC 3986, a space cannot legally be part of a URL. This option
provides a best-effort to convert the provided string into a valid URL.

## --as-idn

Converts a punycode ASCII hostname to its original International Domain Name
in Unicode. If the hostname is not using punycode then the original hostname
is used.

## --curl

Only accept URL schemes supported by libcurl.

## --default-port

When set, trurl uses the scheme's default port number for URLs with a known
scheme, and without an explicit port number.

Note that trurl only knows default port numbers for URL schemes that are
supported by libcurl.

Since, by default, trurl removes default port numbers from URLs with a known
scheme, this option is pretty much ignored unless one of *--get*, *--json*,
and *--keep-port* is not also specified.
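The port handling described for *--default-port* can be pictured with a small
C sketch. This is a simplified model under stated assumptions, not trurl's or
libcurl's code: the table `known_ports` and the functions `default_port` and
`shown_port` are hypothetical names, and the table lists only three
well-known schemes for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup table; illustration only, not libcurl's list. */
struct scheme_port {
  const char *scheme;
  int port;
};

static const struct scheme_port known_ports[] = {
  { "http", 80 },
  { "https", 443 },
  { "ftp", 21 },
  { NULL, 0 }
};

/* Return the default port for 'scheme', or 0 when the scheme is unknown. */
static int default_port(const char *scheme)
{
  size_t i;
  for(i = 0; known_ports[i].scheme; i++) {
    if(!strcmp(known_ports[i].scheme, scheme))
      return known_ports[i].port;
  }
  return 0;
}

/* Decide which explicit port to show. 'port' is 0 when the URL has no
   explicit port. A port equal to the scheme's default is hidden unless
   'keep_port' is set. Returns 0 when no port should be shown. */
static int shown_port(const char *scheme, int port, int keep_port)
{
  if(port && (port == default_port(scheme)) && !keep_port)
    return 0;
  return port;
}
```

With this model, `shown_port("http", 80, 0)` hides the port while
`shown_port("http", 80, 1)` keeps it, which mirrors the difference that
*--keep-port* makes.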
## -f, --url-file [filename]

Read URLs to work on from the given file. Use the filename `-` (a single
minus) to tell trurl to read the URLs from stdin.

Each line needs to be a single valid URL. trurl removes one carriage return
character at the end of the line if present, trims off all the trailing space
and tab characters, and skips all empty (after trimming) lines.

The maximum line length supported in a file like this is 4094 bytes. Lines
that exceed that length are skipped, and a warning is printed to stderr when
they are encountered.

## -g, --get [format]

Output text and URL data according to the provided format string. Components
from the URL can be output when specified as **{component}** or
**[component]**, with the name of the part shown within curly braces or
brackets. You cannot mix braces and brackets for this purpose in the same
command line.

The following component names are available (case sensitive): url, scheme,
user, password, options, host, port, path, query, fragment and zoneid.

**{component}** expands to nothing if the given component does not have a
value.

Components are shown URL decoded by default.

URL decoding a component may cause problems to display it. Such problems
trigger a warning unless **--quiet** is used.

trurl supports a range of different qualifiers, or prefixes, to the component
that change how trurl handles it:

If **url:** is specified, like `{url:path}`, the component gets output URL
encoded. As a shortcut, `url:` also works written as a single colon:
`{:path}`.

If **strict:** is specified, like `{strict:path}`, URL decode problems are
turned into errors. In this stricter mode, a URL decode problem makes trurl
stop what it is doing and return with exit code 10.

If **must:** is specified, like `{must:query}`, it makes trurl return an error
if the requested component does not exist in the URL. By default a missing
component will just be shown blank.
If **default:** is specified, like `{default:url}` or `{default:port}`, and
the port is not explicitly specified in the URL, the scheme's default port is
output if it is known.

If **puny:** is specified, like `{puny:url}` or `{puny:host}`, the punycoded
version of the hostname is used in the output. This option is mutually
exclusive with **idn:**.

If **idn:** is specified like `{idn:url}` or `{idn:host}`, the International
Domain Name version of the hostname is used in the output if it is provided
as a correctly encoded punycode version. This option is mutually exclusive
with **puny:**.

If *--default-port* is specified, all formats are expanded as if they used
*default:*; and if *--punycode* is used, all formats are expanded as if they
used *puny:*. Also note that `{url}` is affected by the *--keep-port* option.

Hosts provided as IPv6 numerical addresses are provided within square
brackets. Like `[fe80::20c:29ff:fe9c:409b]`.

Hosts provided as IPv4 numerical addresses are *normalized* and provided as
four dot-separated decimal numbers when output.

You can access specific keys in the query string using the format
**{query:key}**. Then the value of the first matching key is output using a
case sensitive match. When extracting a URL decoded query key that contains
`%00`, that octet is replaced with a single period `.` in the output.

You can access specific keys in the query string and output all values using
the format **{query-all:key}**. This looks for *key* case sensitively and
outputs all values for that key space-separated.

The *format* string supports the following backslash sequences:

\\ - backslash

\\t - tab

\\n - newline

\\r - carriage return

\\{ - an open curly brace that does not start a variable

\\[ - an open bracket that does not start a variable

All other text in the format string is shown as-is.

## -h, --help

Show the help output.

## --iterate [component]=[item1 item2 ...]

Set the component to multiple values and output the result once for each
iteration.
Several combined iterations are allowed to generate combinations, but only
one *--iterate* option per component. The listed items to iterate over should
be separated by single spaces.

Example:

    $ trurl example.com --iterate=scheme="ftp https" --iterate=port="22 80"
    ftp://example.com:22/
    ftp://example.com:80/
    https://example.com:22/
    https://example.com:80/

## --json

Outputs all set components of the URLs as JSON objects. All components of the
URL that have data get populated in the parts object using their component
names. See below for details on the format.

The URL components are provided URL decoded. Change that with
**--urlencode**.

## --keep-port

By default, trurl removes default port numbers from URLs with a known scheme
even if they are explicitly specified in the input URL. This option makes
trurl keep them.

Example:

    $ trurl https://example.com:443/ --keep-port
    https://example.com:443/

## --no-guess-scheme

Disables libcurl's scheme guessing feature. URLs that do not contain a scheme
are treated as invalid URLs.

Example:

    $ trurl example.com --no-guess-scheme
    trurl note: Bad scheme [example.com]

## --punycode

Uses the punycode version of the hostname, which is how International Domain
Names are converted into plain ASCII. If the hostname is not using IDN, the
regular ASCII name is used.

Example:

    $ trurl http://åäö/ --punycode
    http://xn--4cab6c/

## --qtrim [what]

Trims data off a query.

*what* is specified as a full name of a name/value pair, or as a word prefix
(using a single trailing asterisk (`*`)) which makes trurl remove the tuples
from the query string that match the instruction.

To match a literal trailing asterisk instead of using a wildcard, escape it
with a backslash in front of it. Like `\\*`.

## --query-separator [what]

Specify the single letter used for separating query pairs. The default is `&`
but at least in the past sometimes semicolons `;` or even colons `:` have
been used for this purpose.
If your URL uses something other than the default letter, setting the right
one makes sure trurl can do its query operations properly.

Example:

    $ trurl "https://curl.se?b=name:a=age" --sort-query --query-separator ":"
    https://curl.se/?a=age:b=name

## --quiet

Suppress (some) notes and warnings.

## --redirect [URL]

Redirect the URL to this new location. The redirection is performed on the
base URL, so, if no base URL is specified, no redirection is performed.

Example:

    $ trurl --url https://curl.se/we/are.html --redirect ../here.html
    https://curl.se/here.html

## --replace [data]

Replaces a URL query.

*data* can take the form of a single value, or of a key/value pair in the
shape *foo=bar*. If replace is called on an item that is not in the list of
queries, trurl ignores that item.

trurl URL encodes both sides of the `=` character in the given input data
argument.

## --replace-append [data]

Works the same as *--replace*, but trurl appends the item to the query if it
is not already in the query list.

## -s, --set [component][:]=[data]

Set this URL component. Setting a blank string (`""`) clears the component
from the URL.

The following components can be set: url, scheme, user, password, options,
host, port, path, query, fragment and zoneid.

If a simple `=`-assignment is used, the data is URL encoded when applied. If
`:=` is used, the data is assumed to already be URL encoded and stored as-is.

If `?=` is used, the set is only performed if the component is not already
set. It avoids overwriting any already set data.

You can also combine `:` and `?` into `?:=` if desired.

If no URL or *--url-file* argument is provided, trurl tries to create a URL
using the components provided by the *--set* options. If not enough
components are specified, this fails.

## --sort-query

The "variable=content" tuples in the query component are sorted in a case
insensitive alphabetical order. This helps make URLs identical that otherwise
only had their query pairs in different orders.
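The case insensitive ordering that *--sort-query* describes can be sketched in
C with `qsort()`. This is a minimal illustration under stated assumptions,
not trurl's own comparison function: `ascii_lower`, `cmp_pair` and
`sort_pairs` are hypothetical names, and the sketch compares whole
null-terminated "name=value" strings with ASCII-only case folding.

```c
#include <stdlib.h>
#include <string.h>

/* ASCII-only lowercase, so the result does not depend on the locale */
static int ascii_lower(int c)
{
  return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* qsort() callback: compare two "name=value" strings case insensitively */
static int cmp_pair(const void *a, const void *b)
{
  const unsigned char *s1 = (const unsigned char *)*(const char * const *)a;
  const unsigned char *s2 = (const unsigned char *)*(const char * const *)b;
  while(*s1 && (ascii_lower(*s1) == ascii_lower(*s2))) {
    s1++;
    s2++;
  }
  return ascii_lower(*s1) - ascii_lower(*s2);
}

/* Sort an array of 'n' query pairs in place */
static void sort_pairs(const char **pairs, size_t n)
{
  qsort(pairs, n, sizeof(pairs[0]), cmp_pair);
}
```

Sorting the pairs `b=name`, `C=x` and `a=age` with `sort_pairs()` yields
`a=age`, `b=name`, `C=x`, since `C` is compared as if it were `c`.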
## --trim [component]=[what]

Deprecated: use **--qtrim**.

Trims data off a component. Currently this can only trim a query component.

*what* is specified as a full word or as a word prefix (using a single
trailing asterisk (`*`)) which makes trurl remove the tuples from the query
string that match the instruction.

To match a literal trailing asterisk instead of using a wildcard, escape it
with a backslash in front of it. Like `\\*`.

## --url [URL]

Set the input URL to work with. The URL may be provided without a scheme,
which then typically is not actually a legal URL but trurl tries to figure
out what is meant and guess what scheme to use (unless *--no-guess-scheme* is
used).

Providing multiple URLs makes trurl act on all URLs in a serial fashion.

If the URL cannot be parsed for whatever reason, trurl simply moves on to the
next provided URL, unless *--verify* is used.

## --urlencode

Outputs the URL encoded version of components by default when using *--get*
or *--json*.

## -v, --version

Show version information and exit.

## --verify

When a URL is provided, return an error immediately if it does not parse as a
valid URL. In normal cases, trurl can forgive a bad URL input.

# URL COMPONENTS

## scheme

This is the leading character sequence of a URL, excluding the "://"
separator. It cannot be specified URL encoded.

A URL cannot exist without a scheme, but unless **--no-guess-scheme** is used
trurl guesses which scheme was intended if none was provided.

Examples:

    $ trurl https://odd/ -g '{scheme}'
    https

    $ trurl odd -g '{scheme}'
    http

    $ trurl odd -g '{scheme}' --no-guess-scheme
    trurl note: Bad scheme [odd]

## user

After the scheme separator, there can be a username provided. If it ends with
a colon (`:`), there is a password provided. If it ends with an at character
(`@`) there is no password provided in the URL.

Example:

    $ trurl https://user%3a%40:secret@odd/ -g '{user}'
    user:@

## password

If the password ends with a semicolon (`;`) there is an options field
following.
This field is only accepted by trurl for URLs using the IMAP scheme.

Example:

    $ trurl https://user:secr%65t@odd/ -g '{password}'
    secret

## options

This field can only end with an at character (`@`) that separates the options from the hostname.

    $ trurl 'imap://user:pwd;giraffe@odd' -g '{options}'
    giraffe

If the scheme is not IMAP, the `giraffe` part is instead considered part of the password:

    $ trurl 'sftp://user:pwd;giraffe@odd' -g '{password}'
    pwd;giraffe

We strongly advise users to %-encode `;`, `:` and `@` in URLs to reduce the risk of confusion.

## host

The host component is the hostname or a numerical IP address. If a hostname is provided, it can be an International Domain Name using non-ASCII characters. A hostname can be provided URL encoded.

trurl provides options for working with the IDN hostnames either as IDN or in its punycode version.

Example, convert an IDN name to punycode in the output:

    $ trurl http://åäö/ --punycode
    http://xn--4cab6c/

Or the reverse, convert a punycode hostname into its IDN version:

    $ trurl http://xn--4cab6c/ --as-idn
    http://åäö/

If the URL's hostname starts with an open bracket (`[`) it is a numerical IPv6 address that also must end with a closing bracket (`]`). trurl normalizes IPv6 addresses.

Example:

    $ trurl 'http://[2001:9b1:0:0:0:0:7b97:364b]/'
    http://[2001:9b1::7b97:364b]/

A numerical IPv4 address can be specified using one, two, three or four numbers separated with dots and they can use decimal, octal or hexadecimal. trurl normalizes provided addresses and uses four dotted decimal numbers in its output.

Examples:

    $ trurl http://646464646/
    http://38.136.68.134/

    $ trurl http://246.646/
    http://246.0.2.134/

    $ trurl http://246.46.646/
    http://246.46.2.134/

    $ trurl http://0x14.0xb3022/
    http://20.11.48.34/

## zoneid

If the provided host is an IPv6 address, it might contain a specific zoneid. A number or a network interface name normally.
Example:

    $ trurl 'http://[2001:9b1::f358:1ba4:7b97:364b%enp3s0]/' -g '{zoneid}'
    enp3s0

## port

If the host ends with a colon (`:`) then a port number follows. It is a 16-bit decimal number that may not be URL encoded.

trurl knows the default port number for many URL schemes so it can show port numbers for a URL even if none was explicitly used in the URL. With **--default-port** it can add the default port to a URL even when none is provided.

Example:

    $ trurl http:/a --default-port
    http://a:80/

Similarly, trurl normally hides the port number if the given number is the default.

Example:

    $ trurl http:/a:80
    http://a/

But a user can make trurl keep the port even if it is the default, with **--keep-port**.

Example:

    $ trurl http:/a:80 --keep-port
    http://a:80/

## path

A URL path is assumed to always start with and contain at least a slash (`/`), even if none is actually provided in the URL.

Example:

    $ trurl http://xn--4cab6c -g '[path]'
    /

When setting the path, trurl will inject a leading slash if none is provided:

    $ trurl http://hello -s path="pony"
    http://hello/pony

    $ trurl http://hello -s path="/pony"
    http://hello/pony

If the input path contains dotdot or dot-slash sequences, they are normalized away.

Example:

    $ trurl http://hej/one/../two/../three/./four
    http://hej/three/four

You can append a new segment to an existing path with **--append** like this:

    $ trurl http://twelve/three?hello --append path=four
    http://twelve/three/four?hello

## query

The query part does not include the leading question mark (`?`) separator when extracted with trurl.

Example:

    $ trurl http://horse?elephant -g '{query}'
    elephant

Example, if you set the query with a leading question mark:

    $ trurl http://horse?elephant -s "query=?elephant"
    http://horse/?%3felephant

Query parts are often made up of a series of name=value pairs separated with ampersands (`&`), and trurl offers several ways to work with such.
Append a new name value pair to a URL with **--append**:

    $ trurl http://host?name=hello --append query=search=life
    http://host/?name=hello&search=life

You can **--replace** the value of a specific existing name among the pairs:

    $ trurl 'http://alpha?one=real&two=fake' --replace two=alsoreal
    http://alpha/?one=real&two=alsoreal

If the specific name you want to replace perhaps does not exist in the URL, you can opt to replace *or* append the pair:

    $ trurl 'http://alpha?one=real&two=fake' --replace-append three=alsoreal
    http://alpha/?one=real&two=fake&three=alsoreal

In order to perhaps compare two URLs using query name value pairs, sorting them first at least increases the chances of it working:

    $ trurl "http://alpha/?one=real&two=fake&three=alsoreal" --sort-query
    http://alpha/?one=real&three=alsoreal&two=fake

Remove name/value pairs from the URL by specifying exact name or wildcard pattern with **--qtrim**:

    $ trurl 'https://example.com?a12=hej&a23=moo&b12=foo' --qtrim 'a*'
    https://example.com/?b12=foo

## fragment

The fragment part does not include the leading hash sign (`#`) separator when extracted with trurl.

Example:

    $ trurl http://horse#elephant -g '{fragment}'
    elephant

Example, if you set the fragment with a leading hash sign:

    $ trurl "http://horse#elephant" -s "fragment=#zebra"
    http://horse/#%23zebra

The fragment part of a URL is for local purposes only. The data in there is never actually sent over the network when a URL is used for transfers.

## url

trurl supports **url** as a named component for **--get** to allow for more powerful outputs, but of course it is not actually a "component"; it is the full URL.

Example:

    $ trurl ftps://example.com:2021/p%61th -g '{url}'
    ftps://example.com:2021/path

# JSON output format

The *--json* option outputs a JSON array with one or more objects. One for each URL.

Each URL JSON object contains a number of properties, a series of key/value pairs. The exact set present depends on the given URL.
## url

This key exists in every object. It is the complete URL. Affected by *--default-port*, *--keep-port*, and *--punycode*.

## parts

This key exists in every object, and contains an object with a key for each of the settable URL components. If a component is missing, it means it is not present in the URL. The parts are URL decoded unless *--urlencode* is used.

## parts.scheme

The URL scheme.

## parts.user

The username.

## parts.password

The password.

## parts.options

The options. Note that only a few URL schemes support the "options" component.

## parts.host

The normalized hostname. It might be a UTF-8 name if an IDN name was used. It can also be a normalized IPv4 or IPv6 address. An IPv6 address always starts with a bracket (**[**) - and no other hostnames can contain such a symbol. If *--punycode* is used, the punycode version of the host is outputted instead.

## parts.port

The provided port number as a string. If the port number was not provided in the URL, but the scheme is a known one, and *--default-port* is in use, the default port for that scheme is provided here.

## parts.path

The path. Including the leading slash.

## parts.query

The full query, excluding the question mark separator.

## parts.fragment

The fragment, excluding the hash sign separator.

## parts.zoneid

The zone id, which can only be present in an IPv6 address. When this key is present, then **host** is an IPv6 numerical address.

## params

This key contains an array of query key/value objects. Each such pair is listed with "key" and "value" and their respective contents in the output.

The key/values are extracted from the query where they are separated by ampersands (**&**) - or by the separator the user sets with **--query-separator**.

The query pairs are listed in the order of appearance in a left-to-right order, but can be made alpha-sorted with **--sort-query**.

It is only present if the URL has a query.
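As a minimal sketch of how a script might consume this format, the Python snippet below parses a sample array shaped like the output described above. The sample data and the `summarize` helper are illustrative assumptions and not part of trurl; in practice the text would come from running trurl with *--json*. The point it shows is that keys under **parts** and the **params** key can be absent, so a reader should not index them unconditionally.

```python
import json

# Hypothetical sample shaped like the documented --json output.
sample = """
[
  {
    "url": "https://fake.host/search?q=answers&user=me#frag",
    "parts": {
      "scheme": "https",
      "host": "fake.host",
      "path": "/search",
      "query": "q=answers&user=me",
      "fragment": "frag"
    },
    "params": [
      {"key": "q", "value": "answers"},
      {"key": "user", "value": "me"}
    ]
  }
]
"""


def summarize(entry):
    """Pick a few fields from one URL object (helper name is hypothetical)."""
    parts = entry["parts"]  # documented as present in every object
    return {
        "url": entry["url"],  # documented as present in every object
        "host": parts.get("host"),
        # A component that is not in the URL has no key under "parts".
        "port": parts.get("port"),
        # "params" is only present if the URL has a query.
        "params": [(p["key"], p["value"]) for p in entry.get("params", [])],
    }


summaries = [summarize(entry) for entry in json.loads(sample)]
print(summaries[0])
```

Using `dict.get` for optional keys keeps the reader tolerant of URLs that lack a port, query or fragment.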
# EXAMPLES

## Replace the hostname of a URL

~~~
$ trurl --url https://curl.se --set host=example.com
https://example.com/
~~~

## Create a URL by setting components

~~~
$ trurl --set host=example.com --set scheme=ftp
ftp://example.com/
~~~

## Redirect a URL

~~~
$ trurl --url https://curl.se/we/are.html --redirect here.html
https://curl.se/we/here.html
~~~

## Change port number

This also shows how trurl removes dot-dot sequences.

~~~
$ trurl --url https://curl.se/we/../are.html --set port=8080
https://curl.se:8080/are.html
~~~

## Extract the path from a URL

~~~
$ trurl --url https://curl.se/we/are.html --get '{path}'
/we/are.html
~~~

## Extract the port from a URL

This gets the default port based on the scheme if the port is not set in the URL.

~~~
$ trurl --url https://curl.se/we/are.html --get '{default:port}'
443
~~~

## Append a path segment to a URL

~~~
$ trurl --url https://curl.se/hello --append path=you
https://curl.se/hello/you
~~~

## Append a query segment to a URL

~~~
$ trurl --url "https://curl.se?name=hello" --append query=search=string
https://curl.se/?name=hello&search=string
~~~

## Read URLs from stdin

~~~
$ cat urllist.txt | trurl --url-file -
...
~~~

## Output JSON

~~~
$ trurl "https://fake.host/search?q=answers&user=me#frag" --json
[
  {
    "url": "https://fake.host/search?q=answers&user=me#frag",
    "parts": {
      "scheme": "https",
      "host": "fake.host",
      "path": "/search",
      "query": "q=answers&user=me",
      "fragment": "frag"
    },
    "params": [
      {
        "key": "q",
        "value": "answers"
      },
      {
        "key": "user",
        "value": "me"
      }
    ]
  }
]
~~~

## Remove tracking tuples from query

~~~
$ trurl "https://curl.se?search=hey&utm_source=tracker" --qtrim "utm_*"
https://curl.se/?search=hey
~~~

## Show a specific query key value

~~~
$ trurl "https://example.com?a=home&here=now&thisthen" -g '{query:a}'
home
~~~

## Sort the key/value pairs in the query component

~~~
$ trurl "https://example.com?b=a&c=b&a=c" --sort-query
https://example.com/?a=c&b=a&c=b
~~~

## Work with a query that uses a semicolon separator

~~~
$ trurl "https://curl.se?search=fool;page=5" --qtrim "search" --query-separator ";"
https://curl.se/?page=5
~~~

## Accept spaces in the URL path

~~~
$ trurl "https://curl.se/this has space/index.html" --accept-space
https://curl.se/this%20has%20space/index.html
~~~

## Create multiple variations of a URL with different schemes

~~~
$ trurl "https://curl.se/path/index.html" --iterate "scheme=http ftp sftp"
http://curl.se/path/index.html
ftp://curl.se/path/index.html
sftp://curl.se/path/index.html
~~~

# EXIT CODES

trurl returns a non-zero exit code to indicate problems.

## 1

A problem with --url-file

## 2

A problem with --append

## 3

A command line option is missing an argument

## 4

A command line option mistake or an illegal option combination.

## 5

A problem with --set

## 6

Out of memory

## 7

Could not output a valid URL

## 8

A problem with --qtrim

## 9

If --verify is set and the input URL cannot be parsed.
## 10 A problem with --get ## 11 A problem with --iterate ## 12 A problem with --replace or --replace-append # WWW https://curl.se/trurl trurl-0.16.1/version.h0000664000000000000000000000216015010312005011517 0ustar00#ifndef TRURL_VERSION_H #define TRURL_VERSION_H /*************************************************************************** * _ _ ____ _ * Project ___| | | | _ \| | * / __| | | | |_) | | * | (__| |_| | _ <| |___ * \___|\___/|_| \_\_____| * * Copyright (C) Daniel Stenberg, , et al. * * This software is licensed as described in the file COPYING, which * you should have received as part of this distribution. The terms * are also available at https://curl.se/docs/copyright.html. * * You may opt to use, copy, modify, merge, publish, distribute and/or sell * copies of the Software, and permit persons to whom the Software is * furnished to do so, under the terms of the COPYING file. * * This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY * KIND, either express or implied. * * SPDX-License-Identifier: curl * ***************************************************************************/ #define TRURL_VERSION_TXT "0.16.1" #endif trurl-0.16.1/winbuild/0000755000000000000000000000000015010312005011475 5ustar00trurl-0.16.1/winbuild/.vcpkg0000644000000000000000000000032515010312005012610 0ustar00# git clone https://github.com/microsoft/vcpkg.git # .\vcpkg\bootstrap-vcpkg.bat .\vcpkg\vcpkg install curl:x86-windows-static-md .\vcpkg\vcpkg install curl:x64-windows-static-md .\vcpkg\vcpkg integrate install trurl-0.16.1/winbuild/README.md0000644000000000000000000000553315010312005012762 0ustar00 # Building trurl with Microsoft C++ Build Tools Download and install [Microsoft C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/) When installing, choose the `Desktop development with C++` option. 
## Open a command prompt Open the **x64 Native Tools Command Prompt for VS 2022**, or if you are on an x86 platform **x86 Native Tools Command Prompt for VS 2022** ## Set up the vcpkg repository Note: The location of the vcpkg repository does not necessarily need to correspond to the trurl directory, it can be set up anywhere. But it is recommended to use a short path such as `C:\src\vcpkg` or `C:\dev\vcpkg`, since otherwise you may run into path issues for some port build systems. Once you are in the console, run the below commands to clone the vcpkg repository and set up the curl dependencies: ~~~ git clone https://github.com/microsoft/vcpkg.git .\vcpkg\bootstrap-vcpkg.bat .\vcpkg\vcpkg install curl:x86-windows-static-md .\vcpkg\vcpkg install curl:x64-windows-static-md .\vcpkg\vcpkg integrate install ~~~ Once the vcpkg repository is set up you do not need to run these commands again. If a newer version of curl is released, you may need to run `git pull` in the vcpkg repository and then `vcpkg upgrade` to fetch the new version. ## Build in the console Once the vcpkg repository and dependencies are set up, go to the winbuild directory in the trurl sources: cd trurl\winbuild Then you can call the build command with the desired parameters. The builds will be placed in an output directory as described below. ## Parameters - The `Configuration` parameter can be set to either `Debug` or `Release` - The `Platform` parameter can be set to either `x86` or `x64` ## Build commands - x64 Debug: `msbuild /m /t:Clean,Build /p:Configuration=Debug /p:Platform=x64 trurl.sln` - x64 Release: `msbuild /m /t:Clean,Build /p:Configuration=Release /p:Platform=x64 trurl.sln` - x86 Debug: `msbuild /m /t:Clean,Build /p:Configuration=Debug /p:Platform=x86 trurl.sln` - x86 Release: `msbuild /m /t:Clean,Build /p:Configuration=Release /p:Platform=x86 trurl.sln` Note: If you are using the x64 Native Tools Command Prompt you can also run the x86 build commands. 
## Output directories The output files will be placed in: `winbuild\bin\\\` PDB files will be generated in the same directory as the executable for Debug builds, but they will not be generated for release builds. Intermediate files will be placed in: `winbuild\obj\\\` These include build logs and obj files. ## Tests Tests can be run by going to the directory of the output files in the console and running `perl .\..\..\..\..\test.pl` You will need perl installed to run the tests, such as [Strawberry Perl](https://strawberryperl.com/) trurl-0.16.1/winbuild/trurl.sln0000644000000000000000000000262615010312005013371 0ustar00 Microsoft Visual Studio Solution File, Format Version 12.00 # Visual Studio Version 17 VisualStudioVersion = 17.5.33516.290 MinimumVisualStudioVersion = 10.0.40219.1 Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "trurl", "trurl.vcxproj", "{575657CF-843F-491C-B15B-881C28DF36CA}" EndProject Global GlobalSection(SolutionConfigurationPlatforms) = preSolution Debug|x64 = Debug|x64 Debug|x86 = Debug|x86 Release|x64 = Release|x64 Release|x86 = Release|x86 EndGlobalSection GlobalSection(ProjectConfigurationPlatforms) = postSolution {575657CF-843F-491C-B15B-881C28DF36CA}.Debug|x64.ActiveCfg = Debug|x64 {575657CF-843F-491C-B15B-881C28DF36CA}.Debug|x64.Build.0 = Debug|x64 {575657CF-843F-491C-B15B-881C28DF36CA}.Debug|x86.ActiveCfg = Debug|Win32 {575657CF-843F-491C-B15B-881C28DF36CA}.Debug|x86.Build.0 = Debug|Win32 {575657CF-843F-491C-B15B-881C28DF36CA}.Release|x64.ActiveCfg = Release|x64 {575657CF-843F-491C-B15B-881C28DF36CA}.Release|x64.Build.0 = Release|x64 {575657CF-843F-491C-B15B-881C28DF36CA}.Release|x86.ActiveCfg = Release|Win32 {575657CF-843F-491C-B15B-881C28DF36CA}.Release|x86.Build.0 = Release|Win32 EndGlobalSection GlobalSection(SolutionProperties) = preSolution HideSolutionNode = FALSE EndGlobalSection GlobalSection(ExtensibilityGlobals) = postSolution SolutionGuid = {14A4D782-313F-4F61-A2C5-EF2CD877D3F3} EndGlobalSection EndGlobal 
trurl-0.16.1/winbuild/trurl.vcxproj0000644000000000000000000002211115010312005014257 0ustar00 Debug Win32 Release Win32 Debug x64 Release x64 16.0 Win32Proj {575657cf-843f-491c-b15b-881c28df36ca} trurl 10.0 Application true v143 Unicode Application false v143 true Unicode Application true v143 Unicode Application false v143 true Unicode $(SolutionDir)bin\$(Platform)\$(Configuration)\ $(SolutionDir)obj\$(Platform)\$(Configuration)\ $(SolutionDir)bin\$(Platform)\$(Configuration)\ $(SolutionDir)obj\$(Platform)\$(Configuration)\ $(SolutionDir)bin\$(PlatformShortName)\$(Configuration)\ $(SolutionDir)obj\$(PlatformShortName)\$(Configuration)\ $(SolutionDir)bin\$(PlatformShortName)\$(Configuration)\ $(SolutionDir)obj\$(PlatformShortName)\$(Configuration)\ true true true x64-windows true true x64-windows true true x86-windows true true x86-windows Level4 true WIN32;_DEBUG;_CONSOLE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions) true Console true ws2_32.lib;wldap32.lib;advapi32.lib;crypt32.lib;Normaliz.lib Level4 true true true WIN32;NDEBUG;_CONSOLE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions) true Console true true false ws2_32.lib;wldap32.lib;advapi32.lib;crypt32.lib;Normaliz.lib Level4 true _DEBUG;_CONSOLE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions) true Console true ws2_32.lib;wldap32.lib;advapi32.lib;crypt32.lib;Normaliz.lib Level4 true true true NDEBUG;_CONSOLE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions) true Console true true false ws2_32.lib;wldap32.lib;advapi32.lib;crypt32.lib;Normaliz.lib trurl-0.16.1/winbuild/vcpkg-configuration.json0000644000000000000000000000046215010312005016351 0ustar00{ "default-registry": { "kind": "git", "baseline": "43401f5835f97f48180724bdeb49a8e4a994b848", "repository": "https://github.com/microsoft/vcpkg" }, "registries": [ { "kind": "artifact", "location": "https://aka.ms/vcpkg-ce-default", "name": "microsoft" } ] } trurl-0.16.1/winbuild/vcpkg.json0000644000000000000000000000011615010312005013500 0ustar00{ 
"name": "trurl", "version": "0.x", "dependencies": [ "curl" ] }