@[email protected] to [email protected] • English • 4 days ago

The Open-Source Software Saving the Internet From AI Bot Scrapers

www.404media.co

  • cross-posted to:
  • [email protected]
  • [email protected]

Anubis, which blocks AI scrapers from scraping websites to death, has been downloaded almost 200,000 times.
  • who
    1•3 days ago

    Interesting. Judging by that option’s name, it seems to refer to use of the HTML <meta> tag to refresh a page.

    https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/meta/http-equiv

    Neither this tag nor using it for refresh is new at all. I don’t think I’ve seen it used to detect bots, though. I wonder what Anubis is doing here.

    • JohnEdwa
      2•3 days ago

      It’s simply checking if the connection is from an actual browser, as a scraper pretending to be one won’t actually refresh the page as instructed. It’s going to buy some time, but like the rest of Anubis in general, it will only work until the scrapers get modified to work around it.
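To make the idea in this reply concrete, here is a minimal sketch of a meta-refresh check in Python. It is not Anubis's actual implementation; the names `SECRET`, `MIN_DELAY`, `sign`, `challenge_page`, `verify`, and the `/verify` URL are all hypothetical. The assumption is that the server hands out a page whose `<meta http-equiv="refresh">` tag points at a signed URL, and only a client that actually honors the refresh (after the stated delay) comes back with a valid token.

```python
import hashlib
import hmac
from html import escape

SECRET = b"demo-secret"  # hypothetical server-side key, for illustration only
MIN_DELAY = 1            # seconds the browser is told to wait before refreshing


def sign(client_id: str, issued_at: int) -> str:
    """Return an HMAC tying the challenge to one client and one issue time."""
    msg = f"{client_id}:{issued_at}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def challenge_page(client_id: str, now: int) -> str:
    """Build an HTML page that asks the browser to refresh to a signed URL."""
    url = f"/verify?ts={now}&token={sign(client_id, now)}"
    return (
        "<!doctype html><html><head>"
        f'<meta http-equiv="refresh" content="{MIN_DELAY}; url={escape(url)}">'
        "</head><body>Checking your browser...</body></html>"
    )


def verify(client_id: str, ts: int, token: str, now: int, max_age: int = 60) -> bool:
    """Accept only a correctly signed token that arrives after the delay
    and before it expires."""
    if not hmac.compare_digest(sign(client_id, ts), token):
        return False
    return MIN_DELAY <= now - ts <= max_age
```

As the reply says, this only buys time: a scraper that parses the tag, waits, and follows the URL passes the same check.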
