Sounds very similar to Google's program of scanning thousands of literary books into their system and making them searchable. I think Google's currently tied up in court on that one, with the publishers complaining about copyright issues.
Would be great though to have results from researched books rather than websites. I think you'd need two separate search portals though: one for books, one for the web.