Pricefield | Lemmy
@[email protected]MB to Hacker [email protected]English • 2 years ago

Indexing a Billion Pages

blog.mwmbl.org

  • cross-posted to:
  • [email protected]

It’s two years since we launched Mwmbl, the open source, non-profit search engine, on Boxing Day 2021. A good time to take stock of where we are and where we’re going.

We’ve indexed over 100 million pages

Thanks to our volunteers, who crawl the web using the Firefox extension and command line script, we’re crawling up to a million pages a day, as you can see on our stats page. There are around 50-60 users crawling on an average day.

There is a discussion on Hacker News, but feel free to comment here as well.


Hacker [email protected]


This community serves to share top posts on Hacker News with the wider fediverse.

Rules
  1. Keep it legal
  2. Keep it civil and SFW
  3. Keep it safe for members of marginalised groups