How would one approach indexing pages for a search engine?

I Cast Fist@programming.dev · 1 year ago

key@lemmy.keychat.org · 1 year ago

Links are the main way; sites that aren’t mentioned anywhere on the internet often aren’t worth indexing. That’s why sitemaps and tools for submitting your website to the major search engines peaked in the 00s. But if you really want everything, you could always subscribe to lists of newly registered domains and create rules to scrape them repeatedly with exponential backoff.
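As a rough illustration of those two ideas (discovering pages through links, and revisiting a domain less and less often while it has nothing new), here is a minimal Python sketch using only the standard library. The names `LinkExtractor`, `extract_links`, and `next_delay` are made up for this example, and the one-hour base delay and 30-day cap are arbitrary assumptions, not values any real crawler is known to use.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects absolute http(s) URLs found in <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop the #fragment part.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                if absolute.startswith(("http://", "https://")):
                    self.links.append(absolute)


def extract_links(base_url, html):
    """Return the outgoing links of one already-downloaded page."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def next_delay(failures, base_seconds=3600, cap_seconds=30 * 24 * 3600):
    """Exponential backoff: double the wait after each consecutive
    visit that found nothing, up to a fixed cap (assumed values)."""
    return min(base_seconds * 2 ** failures, cap_seconds)
```

A crawler would feed every URL returned by `extract_links` back into its queue of pages to visit, and use `next_delay` to decide when a newly registered domain that is still empty gets checked again.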

marsara9@lemmy.world · 1 year ago

Your search engine would have to be told about that site some other way.

I’m not sure if you still can, but at least years ago you could register your site with Google so that it could find it without any other links to your site being present.
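On the search engine's side, one common way of being "told" about a site is a sitemap file that the site owner submits or publishes. A minimal sketch of reading one in Python, assuming a plain `<urlset>` document in the sitemaps.org 0.9 format (sitemap index files and compressed sitemaps are not handled, and `sitemap_urls` is just a name chosen for this example):

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the page URLs listed in a <urlset> sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
```

The returned URLs can then go straight into the crawl queue, so the site gets indexed even if nothing else links to it.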

sevenism@lemmy.ml · 1 year ago

I found this an interesting read: https://www.marginalia.nu/log/63-marginalia-crawler/ There are lots of posts about the development of his search engine.