Comments (7)

commented on June 16, 2024

Here is another example from the same website where SelectSingleNode() returns nothing.

Trying to get the highest page number (here "11"):

[image]

Using F12 in Mozilla Firefox, I found it should be in there using the XPath:
/html/body/div[3]/div[2]/div[6]/div[1]/div/div[1]/div/div/div[2]/a[3]

The code below prints nothing, although similar code works on another website. It should print "11":

HtmlWeb web = new HtmlWeb();

HtmlAgilityPack.HtmlDocument document = web.Load("https://www.trancepodcasts.com/a-dream-radio/");

HtmlNode node = document.DocumentNode.SelectSingleNode("/html/body/div[3]/div[2]/div[6]/div[1]/div/div[1]/div/div/div[2]/a[3]");
if (node != null)
{
    Debug.WriteLine(node.InnerText);
}

I checked and "node" is null in the code example. Why is it null?


I tried another website, and similar code works there: https://www.markusschulz.com/category/gdjb/gdjbtracklists/
It prints "100":

[image]

HtmlWeb web = new HtmlWeb();

HtmlAgilityPack.HtmlDocument document = web.Load("https://www.markusschulz.com/category/gdjb/gdjbtracklists/");

HtmlNode node = document.DocumentNode.SelectSingleNode("/html/body/div[3]/div/div/div[3]/div/div/div[3]/a[7]");
if (node != null)
{
    Debug.WriteLine(node.InnerText);
}

from html-agility-pack.

commented on June 16, 2024

Why does nothing seem to work on https://www.trancepodcasts.com/ while on the other example website it works (using similar code)? Did I find a bug?

JonathanMagnan commented on June 16, 2024

Hello @trance-babe,

A little bit like your other issue, what appears on the screen is not what has been loaded by HAP.

If you check the source, you don't find this HTML code: view-source:https://www.trancepodcasts.com/

The HTML code looks more like this:

<ul class="sub-menu">
	<li class="menu-item menu-item-type-taxonomy menu-item-object-category menu-item-126821"><a href="https://www.trancepodcasts.com/a-dream-radio/">A Dream Radio</a></li>
	<li class="menu-item menu-item-type-taxonomy menu-item-object-category menu-item-104457"><a href="https://www.trancepodcasts.com/a-state-of-trance/">A State Of Trance</a></li>
	<li class="menu-item menu-item-type-taxonomy menu-item-object-category menu-item-106361"><a href="https://www.trancepodcasts.com/a-world-into-trance/">A World Into Trance</a></li>
...code...
</ul>

Nothing works on this website because its HTML is dynamic, that is, modified after the page has loaded.
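
One way to check this is to write out the markup HAP actually received and compare it with the browser's F12 view. The following is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced; the class name and the output file name `hap-loaded.html` are illustrative choices, not anything prescribed by the library:

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

class InspectLoadedHtml
{
    static void Main()
    {
        var web = new HtmlWeb();
        HtmlDocument document = web.Load("https://www.trancepodcasts.com/a-dream-radio/");

        // Write out the markup as HAP parsed it; no JavaScript has run on it.
        // "hap-loaded.html" is an arbitrary file name chosen for this sketch.
        File.WriteAllText("hap-loaded.html", document.DocumentNode.OuterHtml);

        // The absolute XPath copied from the browser's developer tools.
        const string xpath = "/html/body/div[3]/div[2]/div[6]/div[1]/div/div[1]/div/div/div[2]/a[3]";
        HtmlNode node = document.DocumentNode.SelectSingleNode(xpath);

        // SelectSingleNode returns null when nothing matches.
        Console.WriteLine(node == null
            ? "Not found in the downloaded HTML."
            : node.InnerText);
    }
}
```

Opening the saved file in a text editor shows which elements exist before any script runs, which is the only markup an XPath query through HAP can match.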

Best Regards,

Jon

commented on June 16, 2024

> Nothing works on this website due to having dynamic HTML or HTML modified after the page is loaded

So I can't use HtmlAgilityPack at all here? Or can I use it on the HTML you gave me?

The other project you talked about: can you tell me which NuGet package I need to install, and can I use that library for this specific website?
Is there a code example that does the same as what I am trying to do with HtmlAgilityPack?

JonathanMagnan commented on June 16, 2024

HAP is meant for parsing HTML rather than for working with dynamic HTML.

I believe you are asking about Selenium WebDriver: https://riptutorial.com/selenium-webdriver/learn/100000/overview

You can find out how to set it up in the tutorial we made years ago: https://riptutorial.com/selenium-webdriver/learn/100001/setup-selenium

Unfortunately, my time doesn't permit me to help you with it. However, using ChatGPT should get you started.

Best Regards,

Jon

commented on June 16, 2024

I can't find ANY examples, and I don't know ChatGPT...

Can you please give a very short code example that does this?

If I use Selenium WebDriver, do I need to use the HTML as shown in the F12 window of my web browser, or the HTML shown when I view the page source (right-click)?

HtmlWeb web = new HtmlWeb();

HtmlAgilityPack.HtmlDocument document = web.Load("https://www.trancepodcasts.com/");

HtmlNode node = document.DocumentNode.SelectSingleNode("//ul[@class='sub-menu mm-listview']");

if (node != null)
{
    // Select all li nodes within the ul
    HtmlNodeCollection links = node.SelectNodes(".//li");

    if (links != null)
    {
        foreach (HtmlNode link in links)
        {
            Console.WriteLine(link.OuterHtml);
        }
    }
}
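
For comparison, here is a minimal sketch of the same idea run against static markup, assuming the HtmlAgilityPack NuGet package is referenced. The HTML string is abridged from the page source quoted earlier in this thread, and the class name `SubMenuLinks` is made up for the example. Two points it illustrates: XPath attribute tests need the `@` prefix (`@class`, not `class`), and `contains(@class, ...)` still matches when an element carries extra classes. The `mm-listview` class appears only in the F12 view and not in the quoted source, so it is presumably added by a script in the browser; that is an assumption, not something confirmed here.

```csharp
using System;
using HtmlAgilityPack;

class SubMenuLinks
{
    static void Main()
    {
        // Static markup abridged from the page source quoted earlier in this thread.
        const string html = @"<ul class=""sub-menu"">
  <li class=""menu-item""><a href=""https://www.trancepodcasts.com/a-dream-radio/"">A Dream Radio</a></li>
  <li class=""menu-item""><a href=""https://www.trancepodcasts.com/a-state-of-trance/"">A State Of Trance</a></li>
</ul>";

        var document = new HtmlDocument();
        document.LoadHtml(html);

        // contains(@class, ...) matches even if the element has additional classes.
        HtmlNodeCollection links =
            document.DocumentNode.SelectNodes("//ul[contains(@class, 'sub-menu')]/li/a");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (links != null)
        {
            foreach (HtmlNode link in links)
            {
                Console.WriteLine(link.InnerText + " -> " + link.GetAttributeValue("href", ""));
            }
        }
    }
}
```

Note that `contains(@class, 'sub-menu')` is a substring test, so it would also match a class such as `sub-menu-wide`; for this page that looseness is probably acceptable.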

JonathanMagnan commented on June 16, 2024

I'm really sorry,

I would like to help you more, but as I already said, my time is very limited (I barely have time for myself at the moment). Selenium is another tool to learn. In short, it opens a new browser, and you can interact with it.
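
As a rough illustration of that workflow, the sketch below lets a real browser load the page and then hands the resulting markup to HAP. It assumes the Selenium.WebDriver and HtmlAgilityPack NuGet packages and an installed Chrome; depending on the Selenium version, a matching ChromeDriver executable may also be required. The class name and the `//a[@href]` query are placeholders for this example, not selectors known to fit this site.

```csharp
using System;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class RenderedPageExample
{
    static void Main()
    {
        // Starts a real Chrome instance; disposing the driver closes it again.
        using (IWebDriver driver = new ChromeDriver())
        {
            driver.Navigate().GoToUrl("https://www.trancepodcasts.com/a-dream-radio/");

            // PageSource is the markup as the browser holds it at this moment,
            // which is usually closer to the F12 view than to "view source".
            string renderedHtml = driver.PageSource;

            // From here on, ordinary HAP parsing applies.
            var document = new HtmlDocument();
            document.LoadHtml(renderedHtml);

            // Placeholder query: replace with an XPath that fits the rendered page.
            HtmlNodeCollection links = document.DocumentNode.SelectNodes("//a[@href]");
            Console.WriteLine(links == null ? 0 : links.Count);
        }
    }
}
```

Content that a script adds late may not be present yet when `PageSource` is read, so an explicit wait before reading it may be needed.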

The first thing you should learn in this case is ChatGPT (or an alternative), and you could even ask your parents for a paid subscription: https://chat.openai.com/

It has become a day-to-day tool for anybody who can use it and take advantage of it. It's pretty simple to use: it is a chat box that provides information (it might be bad or good, but for any common subject, it's accurate enough).

Best Regards,

Jon
