• boonhet@sopuli.xyz
    2 days ago

    Does Cloudflare still look at the agent? I thought they had more reliable data points.

    • auraithx@piefed.social
      2 days ago

      I meant an AI agent, not the browser’s user agent. All data points can be spoofed, and if they can’t be, they’ll pay a human to scrape before they pay for content.

      • boonhet@sopuli.xyz
        2 days ago

        Okay, fair enough, I thought you meant just the user agent. The trouble with having a bot imitate a real user browsing the data is that it’s slow and inefficient. The trouble with paying humans to scrape the data is the same: it’s slow and inefficient. These companies want to ingest data ridiculously fast because there’s so much of it. If all else fails, they’ll resort to paying the content creators, but only for data they really think gives their model a competitive edge on some metric and that they can’t pirate. E.g., I can see them paying for scientific research they can’t get from libgen, but not for some rando’s blog post or a local news website.