benibela / xidel

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.

Home Page: http://www.videlibri.de/xidel.html

License: GNU General Public License v3.0

Pascal 56.34% Shell 22.16% CSS 0.35% JavaScript 8.43% XQuery 4.76% HTML 2.84% PHP 0.03% Perl 0.08% Makefile 0.13% Roff 1.39% Hack 3.48%
xquery xml html json xpath cli command-line http web rest

xidel's Introduction

Xidel

Xidel is a command line tool to download and extract data from HTML/XML pages using CSS selectors, XPath/XQuery 3.0, as well as querying JSON files or APIs (e.g. REST) using JSONiq.

There are dependency-free binaries for Windows, Linux and Mac.

It is a wrapper around my Pascal Internet Tools (see repository internettools), so it supports XPath 2.0, XPath 3.0, XQuery 1.0, XQuery 3.0, JSONiq, CSS selectors and my own extensions/languages (e.g. pattern matching). If you can compile that project, you can compile Xidel.

A simple example to return the titles of all pages linked by some starting page:

 xidel http://example.org --follow //a --extract //title

or simpler

 xidel http://example.org -f //a -e //title

The language can be explicitly chosen. For example

 xidel input.html --css 'a'
 xidel input.html --xpath '//a/@href'
 xidel input.html --xquery 'for $var in //a order by $var return $var'

These return all links, the target URI of each link, and the text of all links in alphabetical order, respectively.

There are more examples on the above page with binaries, in the GitHub wiki and in the directory examples.


Compilation and Installation

You can compile it by calling build.sh and install it by calling build.sh -t. Alternatively you can compile it with the Lazarus IDE.

You can call the commands from the .travis.yml script to download dependencies.

xidel's People

Contributors

benibela, ctrlcctrlv

xidel's Issues

android binary

hello there
can you please provide an android binary? nothing special just plain arm self contained binary
i may have sent you a message before, im not sure
i have busybox, aria2 and perl all i need is xidel :)
thank you for your work. it really does wonders
cheers

android binary 64bit

are you still releasing android binaries ?
it seems newer androids are 64bit

CANNOT LINK EXECUTABLE: "/data/data/com.termux/files/usr/lib/libtermux-exec.so" is 64-bit instead of 32-bit
page record for 0xf727a04c was not found (block_size=64)

is it hard to setup lazarus and cross-compile from linux i386 to arm64?

User Guide

Apart from the readme file, which provides some insight into what can be done with the tool, there does not seem to be a user guide.

e.g. What does the extract function do ?
What are the parameters to this function meant to be ?

XPath and XQuery 3.1?

Hi Benito,

A Stack Overflow question led me to install xidel (via homebrew) and take it for a spin. It's really nice work - I'm enjoying running XQuery very rapidly on remote URLs via the command line.

I'm wondering if you have any plans to support XPath and XQuery 3.1 functions? I found myself reaching for fn:sort, for example. If there's already partial support, do you have a list of implemented/unimplemented functions? Are there any other caveats about the language level supported in xidel? Have you run the XQTS on the XQuery engine?

Thanks!
Joe

Build error: Can't find unit FLREUnicode

Hi -
I'm trying to build from source on FreeBSD and I'm running into an error:

--> ./build.sh
-Fu../../../components/pascal/* -Fi../../../components/pascal -Fu../../../components/pascal/internet/* -Fu../../../components/pascal/internet/examples/* -Fu../../../components/pascal/system/* -Fu../../../components/pascal/system/cvirus/* -Fu../../../components/pascal/system/apihooking/* -Fu../../../components/pascal/system/apihooking/example/* -Fu../../../components/pascal/system/nvcanvas/* -Fu../../../components/pascal/import/regexpr/source/* -Fu../../../components/pascal/import/synapse/* -Fi../../../components/pascal/import/synapse -Fu../../../components/pascal/import/flre/src/* -Fu../../../components/pascal/import/utf8tools/* -Fi../../../components/pascal/import/utf8tools -Fu../../../components/pascal/import/utf8tools/demo/charenc/* -Fu../../../components/pascal/import/utf8tools/demo/charandscan/* -Fu../../../components/pascal/data/* -Fi../../../components/pascal/data -Fu../../../components/pascal/data/examples/* -Fu../../../components/pascal/data/tests/* -Fu../../../components/lazarus/internet/sendBackError/* -Fu../../../components/lazarus/dialogs/* -Fu../../../programs/internet/xidel/* -Fi../../../programs/internet/xidel -Fu../../../programs/internet/xidel/android/*
Free Pascal Compiler version 3.0.2 [2017/12/10] for x86_64
Copyright (c) 1993-2017 by Florian Klaempfl and others
Target OS: FreeBSD for x86-64
Compiling xidel.pas
Compiling xidelbase.pas
Compiling /usr/home/bridger/src/xidel/xidel-0.9.6-src/components/pascal/data/simplexmltreeparserfpdom.pas
simplexmltreeparserfpdom.pas(149,19) Warning: Implicit string type conversion with potential data loss from "WideString" to "AnsiString"
simplexmltreeparserfpdom.pas(150,23) Warning: Implicit string type conversion with potential data loss from "WideString" to "AnsiString"
Compiling /usr/home/bridger/src/xidel/xidel-0.9.6-src/components/pascal/data/xquery_module_file.pas
xquery_module_file.pas(400,15) Warning: Symbol "urlHexDecode" is deprecated: "for internal use"
Compiling /usr/home/bridger/src/xidel/xidel-0.9.6-src/components/pascal/data/xquery_module_math.pas
xquery_module_math.pas(244,3) Note: Local variable "f" not used
Compiling /usr/home/bridger/src/xidel/xidel-0.9.6-src/components/pascal/system/rcmdline.pas
Compiling /usr/home/bridger/src/xidel/xidel-0.9.6-src/components/pascal/data/xquery_utf8.pas
xquery_utf8.pas(33,30) Fatal: Can't find unit FLREUnicode used by xquery_utf8
Fatal: Compilation aborted
Error: /usr/local/bin/ppcx64 returned an error exitcode

I've been able to work around other build errors by grabbing things piecemeal (first time playing around with a Pascal application) but I can't find a FLREUnicode.pas anywhere. Could you give me some suggestions for getting around this particular hurdle?

And: Xidel looks like an awesome tool - thanks for sharing it!

Best,
Bridger

Edit: so, I clearly didn't look hard enough in the TAR, there was a copy under components/pascal/import/flre/. After trying to puzzle out why the build.sh couldn't/wouldn't load that directory, I copied the two files from that path into the directory with build.sh and it completed without any additional errors. Apologies for the confusion!

can't run on raspberry pi2 (armhf)

hi,
i downloaded and extracted the binary from xidel-0.9.6.linuxarm.tar.gz
when i run it, however, i get:

$ ls -l
-rwxr-xr-x 1 teo teo 6603687 Jan  8  2017 xidel
$ ./xidel --version
-bash: ./xidel: No such file or directory

what am i missing?

Compiling xidel using cross compiler for mips platform ?

Hello.

How do I compile the xidel binary for the mips platform using this cross-compiler: http://www.codesourcery.com/sgpp/lite/mips/portal/package4432/public/mips-linux-gnu/mips-4.3-154-mips-linux-gnu-i686-pc-linux-gnu.tar.bz2 ? I use Debian and I need to build a binary for the Dune HD media player, which is based on mips. Simple build examples using gcc or make do not cause problems, but xidel uses FreePascal and I don't know how to build a binary :(

Do I need to install Free Pascal (package "fpc") and use build.sh with command line switches? If so, what are the command line switches?

Sorry, I'm a newbie

Cookies not preserved across redirects

I'm loading an HTML page from disk and having xidel submit the form in the page. This form generates an HTTP POST against a server, which returns an HTTP 302 redirect and a Set-Cookie header. xidel follows this redirect but does not send the cookie it received with the redirect response.

Internet Error: -4 under any non-system shell when using ssh on Android 4.2.2

Hi all.

I get the following error under any non-system shell when using ssh on Android 4.2.2

Error:
Internet Error: -4
when talking to: ... (any site, http, https)

It happens with any site, both http and https. I got this error with a standalone bash 4.3 and with bash 4.4 from entware.

Linux localhost 3.4.5 #1 SMP PREEMPT Thu Mar 27 16:19:17 CST 2014 armv7l GNU/Linux
Xidel 0.9.9 (20200201.7173.9d381d1545ec)
dropbear from SSHDroid 2.1.0

P.S. I can use strace on this device

How to build all this?

So the xidel package is not in any Linux repository. At least it's missing in the Ubuntu breed.
Ok, let's build it. Unfortunately you didn't provide ANY instructions ANYWHERE on how to do this.

I tried to run the build.sh and manage.sh scripts but they just failed.

Specifically, build.sh fails with the error:

xidel.pas(33,6) Fatal: Can't find unit internetaccess used by xidel

while manage.sh cannot find some file which it assumes to be... outside of the repo:

./manage.sh: line 5: /home/user/src/xidel/../../../manageUtils.sh: No such file or directory

Would you please tell us how to do it?

Xidel Does Not Honor HTML <base> Tag when Following or Resolving Links

$ xidel --version
Xidel 0.9.6
(20161120.5245.ead1b6fb3d7b)

If I try something like

xidel 'http://example.com/example01.html' \
  -e '/html/head/title' \
  -f '//a[@id eq "next_page"]'

and the document contains a <base href="..."> tag, then xidel appears to ignore the provided base URL and thus fails to resolve the correct link; instead, it resolves the link using the current URL as the base.

It would be neat if this could be fixed, but in the meantime there is a simple workaround.

xidel 'http://example.com/example01.html' \
  -e '/html/head/title' \
  -f 'fn:resolve-uri(//a[@id eq "next_page"]/@href,/html/head/base/@href)'
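To make the difference concrete, here is a minimal sketch in Python (not xidel code) of how a link resolves with and without a <base> element. The base href and the link target below are made-up values for illustration; only the page URL comes from the report.

```python
from urllib.parse import urljoin

# Page URL from the report; the other two values are hypothetical.
page_url = "http://example.com/example01.html"
base_href = "http://example.com/articles/"   # assumed content of <base href="...">
link_href = "page02.html"                    # assumed @href of the "next_page" link


def resolve_link(page_url, link_href, base_href=None):
    """Resolve link_href against the <base> URL if present, else the page URL."""
    # A relative base href is itself resolved against the page URL first.
    effective_base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(effective_base, link_href)


without_base = resolve_link(page_url, link_href)
with_base = resolve_link(page_url, link_href, base_href)
print(without_base)
print(with_base)
```

The two printed URLs differ, which is the mismatch described above: following the first one requests a page the author of the document never linked to.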

two similar, valid xml files, different results

What works:

$ xidel --data=https://repo.tokenscript.org/aw.app/2020/06/unicon.tsml --extract 'count(//*:contract)'
1

What doesn't work:

$ git clone git@github.com:AlphaWallet/TokenScript-Examples.git
$ cd TokenScript-Examples/examples/edcon
$ xidel --data=unicon.xml --extract 'count(//*:contract)'
0

The two files (unicon.tsml and unicon.xml) are both valid XML files (against one same schema) with identical content except that tsml files are canonicalised (and signed), while unicon.xml is not.

I haven't been able to narrow this down to the minimal difference that causes the failure, but I'll report this first.

XPath bug

On the demo website, for this XML,

<tag id="r">
  <tag id="a">value</tag>
  <tag id="b">val<br/>ue</tag>
  <tag id="c"><span>val</span><span>ue</span></tag>
  <tag id="f"> value</tag>
  <tag id="g">Value</tag>
</tag>

this XPath, //tag[.="value"], selects these nodes,

<tag id="a">value</tag>
<tag id="b">val<br/>ue</tag>
<tag id="c"><span>val</span><span>ue</span></tag>
<tag id="f"> value</tag>
<tag id="g">Value</tag>

but should select these nodes:

<tag id="a">value</tag>
<tag id="b">val<br/>ue</tag>
<tag id="c"><span>val</span><span>ue</span></tag>
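The expected result can be cross-checked outside xidel. The sketch below uses Python's ElementTree and assumes the usual XPath rule that an element's string value is the concatenation of all its descendant text nodes, compared exactly (case-sensitive, no whitespace trimming).

```python
import xml.etree.ElementTree as ET

xml = """<tag id="r">
  <tag id="a">value</tag>
  <tag id="b">val<br/>ue</tag>
  <tag id="c"><span>val</span><span>ue</span></tag>
  <tag id="f"> value</tag>
  <tag id="g">Value</tag>
</tag>"""

root = ET.fromstring(xml)


def string_value(element):
    # String value of an element: all descendant text concatenated in document order.
    return "".join(element.itertext())


# root.iter("tag") also visits the outer <tag id="r">; its string value contains
# whitespace and the text of every child, so it does not equal "value".
selected = [e.get("id") for e in root.iter("tag") if string_value(e) == "value"]
print(selected)
```

Under that rule "f" is excluded because of its leading space and "g" because of its capital letter.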

question about usage, not issue

i have a question about usage, not an issue
so here it goes:
does xidel offer a way to get xpath output?

for example, xpath of elements containing some strings?

im making a script to do this stuff
but i was wondering if there is a faster way

problem with regex & match-contains

xidel 2.htm --output-encoding=input -e "<script><t:match-text contains="mapInfo"/><template:read source="text()" var="regex" regex='[1-9]\d*.\d*|0.\d*[1-9]\d*'/></script>"
**** Retrieving: 2.htm ****
**** Processing: 2.htm ****
Error:
Matching of template failed. for an unknown reason

The 2.htm contains javascript like this:
<script type="text/javascript" language="javascript">
var mapInfo={zoom:15,mapZoom:15,px:"113.9609603881836",py:"22.5539608001709",isKey:"1"};
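Separately from whatever makes the template fail, the regex itself can be checked against the line above. This is a sketch using Python's re module, which may differ in details from the regex engine xidel uses; the second pattern, with the dots escaped, is a suggested variant and not part of the original report.

```python
import re

script_line = 'var mapInfo={zoom:15,mapZoom:15,px:"113.9609603881836",py:"22.5539608001709",isKey:"1"};'

# Pattern as written in the report: "." matches any character, not only a decimal point.
as_reported = r'[1-9]\d*.\d*|0.\d*[1-9]\d*'
# Hypothetical stricter variant with the dots escaped.
escaped = r'[1-9]\d*\.\d*|0\.\d*[1-9]\d*'

loose_matches = re.findall(as_reported, script_line)
strict_matches = re.findall(escaped, script_line)
print(loose_matches)
print(strict_matches)
```

With the unescaped dot, fragments such as the zoom value followed by a comma are matched as well, so the coordinates are no longer the only hits.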

Unable to add node with transform() function

Hello again.
I don't know whether I am misunderstanding xidel/xquery again or if this is a bug.

I need to manipulate a given XML, which works flawlessly for existing nodes:

xidel infile.xml --xml --xquery "transform(/, function(\$e) { \
                          if (name(\$e) = \"surveyls_title\" ) then <surveyls_title>$newsid: $stitle</surveyls_title>
                          else \$e })"

However, I'd like to add a node before a given position, say before <b>:

$ echo "<a><b></b></a>" | xidel --xml - --xquery "transform(/, function(\$e) { if (name(\$e) = \"b\") then (
<cannotinsertthistag/>
> \$e) else \$e })"
<?xml version="1.0" encoding="UTF-8"?>
**** Processing: stdin:/// ****
Error:
err:XPST0003: Unknown or unexpected operator: $ (possible missing comma , or closing parentheses)}] )
in: transform(/, function($e) { if (name($e) = "b") then (
<cannotinsertthistag/>
 [<- error occurs before here] $e) else $e })

I also tried building a variable before (with concat() or string-join()) to no avail (brackets get encoded as &lt; and similar).
Is this possible at all and I am only doing it in the wrong way or does xidel not support this?

Please advise, thanks!

PS:
Btw: The online documentation at http://www.videlibri.de/xidel.html seems to be outdated.
The example given:

xidel --html your-file.html --xquery 'transform(/, function($e) { 
   $e / if (name() = "a") then 
           <a style="{join((@style, "font-weight: bold"), "; ")}">{@* except @style, node()}</a> 
        else .
})' > your-output-file.html

produces an error:

Error:
err:XPST0003: Reserved function name: if
in: transform(/, function($e) { 
   $e / if ( [<- error occurs before here] name() = "a") then 
           <a style="{join((@style, "font-weight: bold"), "; ")}">{@* except @style, node()}</a> 
        else .
})

Xidelscript (JS) remote test get "500" error

When using remote test in Xidelscript (js), a window with the error number "500" is displayed :

An error has been encountered in accessing this page.

  1. Server: videlibri.sourceforge.net
  2. URL path: /cgi-bin/xidelcgi
  3. Error notes: malformed header from script 'xidelcgi': Bad header: [
  4. Error type: 500
  5. Request method: POST
  6. Request query string:
  7. Time: 2019-11-16 05:42:20 UTC (1573882940)

Reporting this problem: The problem you have encountered is with a project web site hosted by SourceForge.net. This issue should be reported to the SourceForge.net-hosted project (not to SourceForge.net).

If this is a severe or recurring/persistent problem, please do one of the following, and provide the error text (numbered 1 through 7, above):

  1. Contact the project via their designated support resources.
  2. Contact the project administrators of this project via email (see the upper right-hand corner of the Project Summary page for their usernames) at [email protected]

If you are a maintainer of this web content, please refer to the Site Documentation regarding web services for further assistance.

NOTE: As of 2008-10-23 directory index display has been disabled by default. This option may be re-enabled by the project by placing a file with the name ".htaccess" with this line:

Options +Indexes

Need help building JSON from variables

I am experimenting with combining different extraction methods, in this case CSS selectors, templates and XPath/XQuery (using PowerShell). Following this answer on Stack Overflow, it seems variables (in my case, "$header", "$time", "$length" and "$author") can be written to JSON by using the file:write-text and serialize-json functions; however, I cannot get them to work.

I know JSON formatting can be written directly in multipage templates, but I am specifically trying to combine different extraction methods through PowerShell. My question then is, how can I build a JSON file with Xidel using the following script:

.\xidel links.html -f //a
-e "header:=css('div.section-content div h1')"
-e '<time datetime={$time}></time>'
-e '<span class="readingTime" title={$length}></span>'
-e author:='distinct-values(//a[@data-user-id])'

Intermittent file open errors

Thanks for this great tool. When running multiple xidel instances at the same time (operating on the same set of files) as part of a system test, we sometimes get the error:

$ xidel -s --xpath3 "/*" XXX
An unhandled exception occurred at $000000000047DF55:
EFOpenError: Unable to open file "XXX"
  $000000000047DF55
  $000000000047DD54
  $000000000045E9E2

Could the problem be that the files are not opened in READ ONLY mode?

Bug: Xidel tries to parse input despite asking for $raw

$ xidel -s - -e '$raw' <<EOF
[Title]

blah-blah-blah
EOF
Error:
err:FOJS0001: error at Title (tkIdentifier) in [Title]
[...]

With -e '$json' I get the same error-message (which is expected), but with -e '$raw' I was actually expecting just:

[Title]

blah-blah-blah

If this is indeed a bug, and if you don't want to change Xidel's default parsing behavior, would something like --input-format=raw be an option?

I found out about this when doing...

$ curl -s https://raw.githubusercontent.com/Reino17/xivid/master/xivid_notes.txt | sed -n 2948,3010p | xidel -s - -e '$json'

or locally $ sed -n 2948,3010p xivid_notes.txt | xidel -s - -e '$json'.
This works just fine, but when I open the file or url directly with Xidel...

$ xidel -s https://raw.githubusercontent.com/Reino17/xivid/master/xivid_notes.txt -e 'x:lines($raw)[position() = 2948 to 3010]'
Error:
err:FOJS0001: error at Reeksen (tkIdentifier) in [Reeksen en positionering]
[...]

Only file:read-text-lines() / unparsed-text-lines() works so far:

$ xidel -s -e 'json(join(file:read-text-lines("xivid_notes.txt")[position() = 2948 to 3010]))'
$ xidel -s -e 'json(join(unparsed-text-lines("https://raw.githubusercontent.com/Reino17/xivid/master/xivid_notes.txt")[position() = 2948 to 3010]))'

Not working if CSS class contains underscores

E.g. scrape movie description from IMDb:
-e 'div.plot_summary div.summary_text' gives:

Error:
err:XPST0003: Need whitespace after operator
in: div.plot_summary div [<- error occurs before here] .summary_text

Also, any hint on how to select by non-standard attributes like itemprop="description", as used on the IMDb site?

[Feature request] --download param

Hi Benito,

Could you make it so that --download, when used without a given '=filename.ext', takes the Content-Disposition value of the header as the filename and saves the file to the current folder? If the file already exists in the filesystem, return an error.

$headers example:

Content-Disposition: attachment; filename="the-meant-filename.zip"; filename*=UTF-8''the-meant-filename.zip

Or use some special char for that...
ie. in combo with a given foldername, and also overwrite if file exists...

--download "foldername\*"

and/or... do you already have a smart workaround (for the time being)? :-)

Many thanx!
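As a rough illustration of the requested behaviour (not of how xidel works internally), the following Python sketch pulls the filename out of the example header with the standard email package and refuses to overwrite an existing file. The function names and the folder name are hypothetical.

```python
from email.message import Message
from pathlib import Path


def filename_from_header(content_disposition):
    """Extract the filename parameter from a Content-Disposition header value."""
    msg = Message()
    msg["Content-Disposition"] = content_disposition
    return msg.get_filename()


def target_path(folder, content_disposition):
    """Build the download path and fail if a file with that name already exists."""
    name = filename_from_header(content_disposition)
    if not name:
        raise ValueError("no filename in Content-Disposition")
    # Keep only the last path component so the header cannot point outside the folder.
    path = Path(folder) / Path(name).name
    if path.exists():
        raise FileExistsError(str(path))
    return path


# Header value from the example above.
header = "attachment; filename=\"the-meant-filename.zip\"; filename*=UTF-8''the-meant-filename.zip"
print(filename_from_header(header))
```

A variant that overwrites instead of failing would simply skip the exists() check, which corresponds to the "foldername\*" idea.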

Is there some mechanism to decode HTML charset

Symptom: xidel does not handle the charset correctly.
Example: xidel.exe --html 5.htm -e "//div[@Class='qxName']/a" --stdin-encoding=oem >5o.htm --output-encoding=input

This 5.htm has , but the output is incorrect.

After I convert the 5.htm file to UTF-8, the output is fine. It seems xidel always treats the input file as UTF-8. Am I missing something, or can xidel only work like this?

fetching data is very slow

Hi, I compared fetching the same data with xidel vs curl | xidel - and the difference is significant.


( Btw, is there a way to suppress the fetching and processing messages? It's weird that they are piped to stderr since they are not errors )

template:optional="true" with null output feature ?

The sample template :

<div class="list rel"> <dl class="plotListwrap clearfix"> <dd> ... </dd> </dl> <div class="listRiconwrap"> <p template:optional="true" class="priceAverage"> <span>{$price := .}</span> </p> <p template:optional="true" class="ratio"> <span>{$ratio:= .}</span> </p> </div> </div>+
Within the outer loop, if <p template:optional="true" class="priceAverage"> exists, everything is OK. If it does not exist, the output is missing one element.

Is there some xidel attribute that can control this, so that a missing element is output as null?

new --delete feature

Like --extract, but the opposite.

<div>
    <span>I want to keep this</span>
    <div class="I_want_to_delete_this">
        <span>blah blah</span>
    </div>
    <span>I want to keep this too</span>
</div>

With --delete "//div[@class='I_want_to_delete_this']" I would expect to end up with:

<div>
    <span>I want to keep this</span>
    <span>I want to keep this too</span>
</div>
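The requested semantics can be sketched in Python with ElementTree. This only illustrates the expected result; the helper name delete_matching is hypothetical and says nothing about how such an option would be implemented in xidel.

```python
import xml.etree.ElementTree as ET

doc = """<div>
    <span>I want to keep this</span>
    <div class="I_want_to_delete_this">
        <span>blah blah</span>
    </div>
    <span>I want to keep this too</span>
</div>"""

root = ET.fromstring(doc)


def delete_matching(root, tag, class_name):
    """Remove every descendant <tag class="class_name"> from the tree in place."""
    # ElementTree has no parent pointers, so visit each parent and filter its children.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag and child.get("class") == class_name:
                parent.remove(child)


delete_matching(root, "div", "I_want_to_delete_this")
remaining = [span.text for span in root.iter("span")]
print(ET.tostring(root, encoding="unicode"))
```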

err:XPTY0004 invalid conversion to type singleton - xpath limitation ?

i got 2 separate selects working, but a concat of the two doesn't
the outer div (html/body/div) repeats many times in the real page, but i only left one here for simplicity
i understand it may be an xpath limitation,

what can i do?
i read somewhere that xidel supports some kind of script for/foreach iteration over nodes, but i can't find it

or transforming the children into siblings: is this even possible? would css selectors be easier?
or xquery?

any help is greatly appreciated

this is the html

<html>
<body>

<div>
<span class="a">
	33
	<div>kg</div>
	234
	<div>m</div>
</span>

<span class="b">
	44
	<div>kg</div>
	345
	<div>m</div>
	5678
	<div>l</div>
</span>
</div>

</body>
</html>

this is the desired output

33|kg
234|m
44|kg
345|m
5678|l

xidel t2.html -e "html/body/div/span/text()"

33
234

44
345
5678

xidel t2.html -e "html/body/div/span/div/text()"

kg
m
kg
m
l

but
xidel t2.html -e "html/body/div/concat(span/text(),'|',span/div/text())"

Error:
err:XPTY0004: Invalid conversion from (33
, 
234
, 
, 
44
, 
345
, 
5678
, 
) to type singleton
Q{http://www.benibela.de/2012/pxp/extensions}concat((
33
, 
234
, 
, 
44
, 
345
, 
5678
, 
), "|", (kg, m, kg, m, l))

thank you for your help
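The error arises because concat() takes a single item per argument, while span/text() and span/div/text() each return a whole sequence. The pairing has to happen one text node at a time. Below is a sketch of that pairing logic in Python (ElementTree), assuming each number is the text immediately preceding its unit <div>; it shows the logic only and is not an xidel expression.

```python
import xml.etree.ElementTree as ET

html = """<html><body><div>
<span class="a">
    33
    <div>kg</div>
    234
    <div>m</div>
</span>
<span class="b">
    44
    <div>kg</div>
    345
    <div>m</div>
    5678
    <div>l</div>
</span>
</div></body></html>"""

root = ET.fromstring(html)


def value_unit_pairs(span):
    """Pair each text node in span with the <div> that directly follows it."""
    pairs = []
    preceding_text = span.text          # text before the first child
    for unit in span.findall("div"):
        value = (preceding_text or "").strip()
        pairs.append(f"{value}|{(unit.text or '').strip()}")
        preceding_text = unit.tail      # text between this <div> and the next one
    return pairs


lines = [pair for span in root.iter("span") for pair in value_unit_pairs(span)]
print("\n".join(lines))
```

The same idea in XPath/XQuery terms would be to iterate over the unit elements and look at the text node just before each one, rather than passing both full sequences to concat().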

xidel skips the commented "<!-- abc -->" html

Hi all.

xidel skips the commented data "<!-- abc -->" in html. I have to use sed to remove the comment markers "<!-- -->" so that xidel can parse the data in the html. Is it possible to do this without using sed?

can i use xquery flwor with xidel?

i tried something simple but i don't know how to read the file; this doesn't work

for $a in doc("books.xml")//author
order by $a/last, $a/first
return $a/last

License

According to the project's SF.net page, this code is released under GPLv2, yet a "LICENSE.{md|txt}" file is missing. Please add one (see choosealicense.com).

Allow escaping of quotes

I often find myself wanting to use something like this:

xidel foo.htm -e 'join(//something, "\",\"")'
                                   --^^-^^--note these

Basically joining stuff together in csv format which requires quoted strings. However: xidel fails when using doublequotes there so the escaping does not work.
Is there a workaround or is there a chance that escaping gets added to xidel?

PS: Awesome tool!

HTTPS sites do not work with an HTTPS proxy

I have a proxy with trustedproxies.com . HTTPS sites work fine with this proxy when using curl, wget, or Apache HTTP Components.

Xidel works fine using the same proxy over HTTP

Reproduction:

xidel --proxy='ip:port' https://example.com -e //title

and it waits a while then...

**** Retrieving:https://example.com ****
An unhandled exception occurred at $0010757D :
EInternetException : Connecting failed
when talking to: https://example.com/
$0010757D
$00107C38
$00026C5D
$00026AD8
$000271D5
$000253E6
$0002508E
$00011834
$0003AF28
$0003ABE0
$0003C741
$0004028E
$0004E60D

The above is from Xidel 0.8, but 0.9.6 has the same issue

POSTing JSON data impossible/needs double curly brackets

I am trying to send JSON data to an API.

user@host:~$ xidel -H "Content-Type: application/json" -d '{"method":"get_session_key"}' $url -e "result" --verbose
Error:
err:XPST0003: "}" expected, but ":" found
in: {"method": [<- error occurs before here] "get_session_key"}

I was unable to use the json-serialize() function to do the actual JSON structuring, but I am a complete newbie to XPath/XQuery and xidel. Maybe that's not possible at all.

Sending JSON however is possible by using double curly brackets.

I don't know if this is intended - at least it is not documented. But this works:

user@host:~$ xidel -H "Content-Type: application/json" -d '{{"method":"get_session_key"}}' $url -e "result" --verbose

What is the intended way of sending/posting JSON data and parsing the result again (in bash)?

map:merge() returns duplicate keys

echo '{"a":1,"b":2,"c":3}' | xidel -s - -e 'map:merge(($json,{"c":4}),{"duplicates":"use-last"})'
{
  "a": 1,
  "b": 2,
  "c": 3,
  "c": 4
}

Bug? I know development of XQuery 3.1 functions is still in progress, but this shouldn't happen.
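For comparison, this is the result one would expect from "use-last" semantics, sketched with plain Python dictionaries: each key appears once and the value from the last map wins. The helper name merge_use_last is made up for this illustration.

```python
existing = {"a": 1, "b": 2, "c": 3}
override = {"c": 4}


def merge_use_last(*maps):
    """Merge maps left to right; on a duplicate key the last value wins."""
    merged = {}
    for m in maps:
        merged.update(m)
    return merged


merged = merge_use_last(existing, override)
print(merged)
```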

Extract correct item from HTML when page contains identical classes

E.g. HTML structure:

<div class="parent">
      <a class="item">Item 1</a>
      <a class="item">Item 2</a>
</div>

<div class="parent">
      <h4 class="unique-identifier">Extract what's under:</h4>
      <a class="item">Item 3</a>
      <a class="item">Item 4</a>
</div>

As you can see, the HTML has identical parent items all over, but only one parent item has an h4 tag inside. Can that be used as a unique identifier, so that only the items under that parent (Item 3 and Item 4) are extracted?

If I run -e 'div.parent a' it extracts all items from all parents, but we want only items from within the parent that also has an h4 child.

Also, why does nothing get extracted if you add the class to the 'a' element? E.g. -e 'div.parent a.item' doesn't work.
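One way to express "the parent that also has an h4 child" is a path with a child predicate. The sketch below tries that idea with Python's ElementTree, which supports a small XPath subset; the wrapping <body> element is added here only to make the fragment a single document, and whether the same path behaves identically in xidel is an assumption.

```python
import xml.etree.ElementTree as ET

page = """<body>
<div class="parent">
      <a class="item">Item 1</a>
      <a class="item">Item 2</a>
</div>
<div class="parent">
      <h4 class="unique-identifier">Extract what's under:</h4>
      <a class="item">Item 3</a>
      <a class="item">Item 4</a>
</div>
</body>"""

root = ET.fromstring(page)

# [h4] keeps only the div elements that have an <h4> child.
path = ".//div[@class='parent'][h4]/a[@class='item']"
items = [a.text for a in root.findall(path)]
print(items)
```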

Prepend baseurl before following

Is there a way to prepend the base URL to every link before following it? In my case I have relative links in a JSON file which I want to follow. I need to add the URL to the links before I can follow them.

Is this even possible? That's my command so far:

xidel file.json -f '$json()["url"]' -e '//html'

Can you scroll a page using Xidel?

Hey,

I'm trying to extract the episode links on (for instance) https://puhutv.com/siyah-inci-detay, but some of the links aren't present in the HTML code until you've scrolled down (so they will not be detected by Xidel). Is it possible to add scrolling to Xidel?

Tried to find information about it in the documentation without luck so far.

Specify a file as XQuery

Thanks for the nice xpath/xquery tool! Helped me to find the right input-files for testing stuff at work.

Can I specify a file on the command-line to serve as the XQuery script? And how is this done?

r6739: Colouring regression

[Screenshot: xidel-0 9 9 6739_colouring-regression1]
Screen text becomes grey in case of an error.

[Screenshot: xidel-0 9 9 6739_colouring-regression2]
The screen text becomes grey even when there's no error.

post request not working

I need to send a post request. I try so doing:

xidel "http://annuaire.cncc.fr/index.php?page=liste&p=1" --method POST -d "recherche=1&nature_fiche=toutes&nom_societe=&cp_ville=&crcc=0&x=34&y=18" -e '//html'

That doesn't work. The same happens when I split up the parameters:

xidel "http://annuaire.cncc.fr/index.php" --method POST -d "recherche=1" -d "nature_fiche=toutes" -d "nom_societe=" -d "cp_ville=" -d "crcc=0" -d "x=34" -d "y=18" -e '//html'

it always prints **** Retrieving (GET): http://annuaire.cncc.fr/index.php ****.
How do I convince xidel to do a post request instead?

Internet Error: -4

% xidel https://www.archlinux.org/packages/extra/x86_64/firefox/ -e //title
Error:
Internet Error: -4 
when talking to: https://www.archlinux.org/packages/extra/x86_64/firefox/

xidel openssl error messages on androidarm64 (termux)

xidel produces the correct result but also outputs a bunch of error messages. This started to happen after termux's openssl update to 1.1.1b-3. Before this openssl upgrade, there were no error messages.

termux 0.68

Xidel 0.9.9
(20190104.6739.b64562007cb7)
xidel-0.9.9.20190104.6739.b64562007cb7.androidarm64

openssl 1.1.1b-3 aarch64

$ xidel --data="https://feeds.twit.tv/twig.xml" --extract='head(//rss/channel/item/enclosure/@url)'
**** Retrieving (GET): https://feeds.twit.tv/twig.xml ****
**** Processing: https://feeds.twit.tv/twig.xml ****
https://www.podtrac.com/pts/redirect.mp3/cdn.twit.tv/audio/twig/twig0505/twig0505.mp3
An unhandled exception occurred at $0000007BE68C0200:
EAccessViolation: Access violation
$0000007BE68C0200
$0000007BE7B0BC24
$0000007BE7AFD9E4
$0000007BE7AF76E4
$0000007BE7AF74BC
$0000007BE7AF3130
$0000007BE78F51BC
$00000057FA6F8294
$00000057FA53CAE8
$00000057FA6F80B0 FREELIBRARY, line 114 of ../../../components/pascal/import/synapse/synafpc.pas
$00000057FA6FFA78 DESTROYSSLINTERFACE, line 2097 of ../../../components/pascal/import/synapse/ssl_openssl_lib.pas
$00000057FA6FFFB4 SSL_OPENSSL_LIB_$$_finalize$, line 2223 of ../../../components/pascal/import/synapse/ssl_openssl_lib.pas
$00000057FA539E98
$00000057FA53A26C
$00000057FA53A28C
$00000057FA529ED4 main, line 98 of xidel.pas
$0000007BE79C2E00

Also:

$ curl -s -L https://feeds.twit.tv/twig.xml | xidel - --extract='head(//rss/channel/item/enclosure/@url)'
**** Processing: stdin:/// ****
https://www.podtrac.com/pts/redirect.mp3/cdn.twit.tv/audio/twig/twig0505/twig0505.mp3

Thanks for this powerful tool! I use it more and more.
